Reported by: Sujith <m.sujith <at> gmail.com>
Date: Mon, 13 Feb 2017 18:41:01 UTC
Severity: normal
Tags: moreinfo
Found in version 26.0.50
Done: Alan Mackenzie <acm <at> muc.de>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 25706 in the body.
You can then email your comments to 25706 AT debbugs.gnu.org in the normal way.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#25706; Package emacs. (Mon, 13 Feb 2017 18:41:01 GMT)
Acknowledgement sent to Sujith <m.sujith <at> gmail.com>; copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 13 Feb 2017 18:41:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
From: Sujith <m.sujith <at> gmail.com> To: bug-gnu-emacs <at> gnu.org Subject: 26.0.50; Slow C file fontification Date: Mon, 13 Feb 2017 23:50:20 +0530
On a machine that is not very high-powered, opening some C files
and trying to edit/view them is very slow.

For example:
https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h

This is a large file and filled with macros.
Is there any way to view this without disabling font-lock entirely?

I am using the master branch and I have these in my .emacs:

    (global-font-lock-mode t)
    (setq font-lock-maximum-decoration (quote ((c-mode . 2) (c++-mode . 2) (t . t))))
    (setq c-font-lock-extra-types
          (quote ("\\sw+_t" "bool" "complex" "imaginary" "FILE" "lconv" "tm"
                  "va_list" "jmp_buf" "Lisp_Object"
                  "u8" "u16" "u32" "u64" "s8" "s16" "s32" "s64"
                  "__le16" "__le32" "__le64" "__be16" "__be32" "__be64"
                  "__s8" "__s16" "__s32" "__s64"
                  "__u8" "__u16" "__u32" "__u64")))

The machine is a low-end 10-inch netbook. Some details:

$ uname -a
Linux the-damned 4.10.0-rc7-wt #16 SMP PREEMPT Tue Feb 7 10:47:38 IST 2017 x86_64 GNU/Linux

$ free -m -h
         total   used   free   shared   buff/cache   available
Mem:     1.8G    542M   560M   87M      766M         1.0G
Swap:    2.0G    24M    2.0G

$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 55
model name  : Intel(R) Celeron(R) CPU N2807 @ 1.58GHz
stepping    : 8
microcode   : 0x811
cpu MHz     : 1828.644
cache size  : 1024 KB

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 55
model name  : Intel(R) Celeron(R) CPU N2807 @ 1.58GHz
stepping    : 8
microcode   : 0x811
cpu MHz     : 1805.267
cache size  : 1024 KB

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /build/gcc-multilib/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --enable-libmpx --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release Thread model: posix gcc version 6.3.1 20170109 (GCC) In GNU Emacs 26.0.50.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.22.7) of 2017-02-13 built on the-damned Repository revision: 271dcf8652ccf94d8582b2bcdb26f066d0b946a2 Windowing system distributor 'The X.Org Foundation', version 11.0.11901000 Recent messages: Checking 57 files in /usr/share/emacs/26.0.50/lisp/eshell... Checking 70 files in /usr/share/emacs/26.0.50/lisp/erc... Checking 34 files in /usr/share/emacs/26.0.50/lisp/emulation... Checking 172 files in /usr/share/emacs/26.0.50/lisp/emacs-lisp... Checking 24 files in /usr/share/emacs/26.0.50/lisp/cedet... Checking 57 files in /usr/share/emacs/26.0.50/lisp/calendar... Checking 87 files in /usr/share/emacs/26.0.50/lisp/calc... Checking 103 files in /usr/share/emacs/26.0.50/lisp/obsolete... Checking for load-path shadows...done Message modified; kill anyway? 
(y or n) y Configured using: 'configure --prefix=/usr --without-libsystemd --without-dbus --without-gconf --without-gsettings --without-selinux --without-threads --without-gpm --without-xaw3d --without-toolkit-scroll-bars --without-m17n-flt --without-libotf --without-imagemagick --without-rsvg --without-png --without-gif --without-tiff --without-jpeg --without-xpm --with-sound=no CFLAGS=-O3' Configured features: NOTIFY ACL GNUTLS LIBXML2 FREETYPE XFT ZLIB GTK3 X11 Important settings: value of $LANG: en_IN.UTF-8 locale-coding-system: utf-8-unix Major mode: C Minor modes in effect: global-magit-file-mode: t magit-file-mode: t diff-auto-refine-mode: t magit-auto-revert-mode: t auto-revert-mode: t global-git-commit-mode: t async-bytecomp-package-mode: t shell-dirtrack-mode: t display-battery-mode: t display-time-mode: t iswitchb-mode: t savehist-mode: t save-place-mode: t tooltip-mode: t global-eldoc-mode: t mouse-wheel-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t column-number-mode: 1 line-number-mode: t transient-mark-mode: t abbrev-mode: t Load-path shadows: /home/sujith/.emacs.d/elpa/emms-20160304.920/tq hides /usr/share/emacs/26.0.50/lisp/emacs-lisp/tq Features: (cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs pp shadow flyspell ispell face-remap emacsbug ibuf-ext ibuffer ibuffer-loaddefs w3m-form w3m-filter w3m-bookmark w3m-tabmenu w3m-session ffap w3m timezone w3m-hist w3m-fb bookmark-w3m w3m-ems wid-edit w3m-ccl ccl w3m-favicon w3m-image w3m-proc w3m-util dired-aux magit-obsolete magit-blame magit-stash magit-bisect magit-remote magit-commit magit-sequence magit-notes magit-worktree magit-branch magit-files magit-refs magit-status magit magit-repos magit-apply magit-wip magit-log magit-diff smerge-mode diff-mode magit-core magit-autorevert autorevert filenotify magit-process magit-margin magit-mode 
magit-git crm magit-section magit-popup git-commit magit-utils log-edit easy-mmode pcvs-util add-log with-editor async-bytecomp async tramp-sh tramp tramp-compat tramp-loaddefs trampver ucs-normalize shell pcomplete parse-time dash advice mu4e-contrib mu4e desktop frameset mu4e-speedbar speedbar sb-image ezimage dframe mu4e-main mu4e-context mu4e-view cal-menu calendar cal-loaddefs thingatpt browse-url comint ansi-color mu4e-headers mu4e-compose mu4e-draft mu4e-actions ido rfc2368 smtpmail sendmail mu4e-mark mu4e-message flow-fill html2text mu4e-proc mu4e-proc-mu mu4e-utils doc-view jka-compr image-mode mu4e-lists mu4e-vars message puny format-spec rfc822 mml mml-sec epa derived epg gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr mailabbrev mail-utils gmm-utils mailheader hl-line cl mu4e-meta battery time dired-x dired dired-loaddefs edmacro kmacro xcscope ring server iswitchb savehist saveplace finder-inf info package epg-config url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache url-vars seq byte-opt subr-x gv bytecomp byte-compile cl-extra help-mode easymenu cconv cl-loaddefs pcase cl-lib time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties 
overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote inotify dynamic-setting font-render-setting move-toolbar gtk x-toolkit x multi-tty make-network-process emacs) Memory information: ((conses 16 1112940 3463) (symbols 48 36857 5) (miscs 40 96 357) (strings 32 73343 16247) (string-bytes 1 2484237) (vectors 16 59639) (vector-slots 8 1342999 94842) (floats 8 475 158) (intervals 56 127752 56) (buffers 976 21))
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#25706; Package emacs,cc-mode. (Mon, 30 Nov 2020 11:27:01 GMT)
Message #8 received at 25706 <at> debbugs.gnu.org:
From: Lars Ingebrigtsen <larsi <at> gnus.org> To: Sujith <m.sujith <at> gmail.com> Cc: 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 12:26:18 +0100
Sujith <m.sujith <at> gmail.com> writes:

> On a machine that is not very high-powered, opening some C files
> and trying to edit/view them is very slow.
>
> For example:
> https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h
>
> This is a large file and filled with macros.
> Is there any way to view this without disabling font-lock entirely?

(This bug report unfortunately got no response at the time.)

I tried reproducing this on a pretty new laptop, and opening the file in
question (with your settings) took less than a second with Emacs 28.

You say "very slow", but not what kind of time scale you mean -- one
second or one minute or something.

Are you still seeing this issue in more recent versions of Emacs?

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Nov 2020 11:27:02 GMT)
(Mon, 30 Nov 2020 11:38:02 GMT) Message #13 received at 25706 <at> debbugs.gnu.org:
From: Lars Ingebrigtsen <larsi <at> gnus.org> To: 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 12:37:29 +0100
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> Are you still seeing this issue in more recent versions of Emacs?

The mail bounced, so I guess there's unlikely to be any further progress
in this bug report, and I'm closing it.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Nov 2020 11:38:02 GMT)
(Mon, 30 Nov 2020 12:47:02 GMT) Message #18 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: 25706 <at> debbugs.gnu.org Cc: Alan Mackenzie <acm <at> muc.de>, Lars Ingebrigtsen <larsi <at> gnus.org> Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 13:46:30 +0100
>> https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h
>
> I tried reproducing this on a pretty new laptop, and opening the file in
> question (with your settings) took less than a second with Emacs 28.

My lappy is less new but not really that slow -- compared to the hardware of the original reporter it's a speed demon -- but opening the file takes almost 4 s here. More importantly, scrolling through the file is painfully slow.

The code in the file is nothing out of the ordinary; it consists of macros that are 1-3 lines each; definitely not a pathological case. The entire fontification takes 64 s for this file.

I'd say the complaint is warranted, even if the original reporter is no longer reachable. Reopen?

Alan, do you have a diagnosis?
(Mon, 30 Nov 2020 12:50:01 GMT) Message #21 received at 25706 <at> debbugs.gnu.org:
From: Lars Ingebrigtsen <larsi <at> gnus.org> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Alan Mackenzie <acm <at> muc.de>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 13:49:09 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> I'd say the complaint is warranted, even if the original reporter is
> no longer reachable. Reopen?

OK; reopening.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 30 Nov 2020 12:50:02 GMT)
(Mon, 30 Nov 2020 16:28:01 GMT) Message #26 received at 25706 <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org> To: Mattias Engdegård <mattiase <at> acm.org> Cc: acm <at> muc.de, larsi <at> gnus.org, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 18:27:19 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 30 Nov 2020 13:46:30 +0100
> Cc: Alan Mackenzie <acm <at> muc.de>, Lars Ingebrigtsen <larsi <at> gnus.org>
>
> >> https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h
> >
> > I tried reproducing this on a pretty new laptop, and opening the file in
> > question (with your settings) took less than a second with Emacs 28.
>
> My lappy is less new but not really that slow -- compared to the hardware of the original reporter it's a speed demon --
> but opening the file takes almost 4 s here. More importantly, scrolling through the file is painfully slow.
>
> The code in the file is nothing out of the ordinary; it consists of macros that are 1-3 lines each; definitely not a pathological case. The entire fontification takes 64 s for this file.
>
> I'd say the complaint is warranted, even if the original reporter is no longer reachable. Reopen?
>
> Alan, do you have a diagnosis?

I suggest running this under "M-x profiler-start" and posting the fully
expanded profile you get from that. Bonus points for doing that after
loading the CC Mode files as .el (not .elc), which will make the
profile more detailed.
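The profiling step suggested above can be sketched roughly as follows. This is a minimal sketch, not something posted in the thread: `my-profile-fontification` is a hypothetical helper name, and it assumes an Emacs build with CPU profiling support. It does not handle loading the CC Mode files as .el, which would still have to be arranged separately.

```elisp
;; Sketch only: profile whole-buffer fontification of the current buffer.
;; `my-profile-fontification' is a hypothetical name, not part of Emacs.
(require 'profiler)

(defun my-profile-fontification ()
  "Fontify the whole current buffer under the CPU profiler and show a report."
  (interactive)
  (profiler-start 'cpu)
  (unwind-protect
      ;; Force fontification of every part of the buffer.
      (font-lock-ensure (point-min) (point-max))
    ;; Take the report while the profiler is still running, then stop it.
    (profiler-report)
    (profiler-stop)))
```

Run it with M-x my-profile-fontification in the buffer visiting the header file, then expand the entries in the report buffer before copying it.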
(Mon, 30 Nov 2020 16:39:01 GMT) Message #29 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 16:38:29 +0000
Hello, Mattias.

On Mon, Nov 30, 2020 at 13:46:30 +0100, Mattias Engdegård wrote:
> >> https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h

> > I tried reproducing this on a pretty new laptop, and opening the file
> > in question (with your settings) took less than a second with Emacs
> > 28.

> My lappy is less new but not really that slow -- compared to the
> hardware of the original reporter it's a speed demon -- but opening the
> file takes almost 4 s here. More importantly, scrolling through the
> file is painfully slow.

> The code in the file is nothing out of the ordinary; it consists of
> macros that are 1-3 lines each; definitely not a pathological case. The
> entire fontification takes 64 s for this file.

> I'd say the complaint is warranted, even if the original reporter is no
> longer reachable. Reopen?

> Alan, do you have a diagnosis?

Yes. I've had a look at the file, and it's large and lacking in braces.
There are functions in CC Mode which search backwards for opening braces
to establish context. When there are none, the search goes back to BOB
(beginning of buffer). Lots of these searches, not efficiently cached,
take a long time.

It's a problem with CC Mode, not with the source file. It's a known
problem, and not easy to fix.

--
Alan Mackenzie (Nuremberg, Germany).
(Mon, 30 Nov 2020 16:54:02 GMT) Message #32 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 17:53:04 +0100
30 nov. 2020 kl. 17.38 skrev Alan Mackenzie <acm <at> muc.de>:

> Yes. I've had a look at the file, and it's large and lacking in braces.
> There are functions in CC Mode which search backwards for opening braces
> to establish context. When there are none, the search goes back to BOB.
> Lots of these searches, not efficiently cached, take a long time.
>
> It's a problem with CC Mode, not with the source file. It's a known
> problem, and not easy to fix.

Actually, it's the underscores!

Demo: fill a file with the line pairs

#define abc_defg_hij_klm__nop_qrst_uvw_xyz_w__ooa_cin_e__aoi__uynv(s) \
0

repeated 1000 times, thus making it 2000 lines. Save as something.h. Slow!
Now replace each underscore with a letter. Save. Fast!

Fontifying the 2000 line file (with underscores) takes longer than the original 80000 line file.

I started going through c-find-decl-spots and c-find-decl-prefix-search (together there are while statements nested 4 deep) but am not sure exactly where the trouble is. A regexp? Something syntax-char related (since '_' has symbol syntax, not word)?

CC-mode in general thrashes the regexp cache; the miss rate is at 27 % for the original file, which is way too high. Enlarging the cache enough to eliminate misses helps, but not nearly enough.
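The demo file described above could be generated with something like the following. This is only a sketch: `my-make-macro-demo` and its REPLACE-UNDERSCORES argument are hypothetical names, and the replacement letter "x" is an arbitrary choice, since the message does not say which letter was used.

```elisp
;; Sketch only: write the 2000-line demo header described in the message.
;; `my-make-macro-demo' is a hypothetical helper, not part of Emacs.
(defun my-make-macro-demo (file &optional replace-underscores)
  "Write 1000 copies of the two-line macro to FILE.
With REPLACE-UNDERSCORES non-nil, use a letter in place of each underscore."
  (let ((name "abc_defg_hij_klm__nop_qrst_uvw_xyz_w__ooa_cin_e__aoi__uynv"))
    (when replace-underscores
      ;; "x" is an arbitrary stand-in letter (an assumption).
      (setq name (replace-regexp-in-string "_" "x" name t t)))
    (with-temp-file file
      (dotimes (_ 1000)
        ;; Line 1 ends in a backslash continuation; line 2 is the body.
        (insert "#define " name "(s) \\\n0\n")))))
```

For example, (my-make-macro-demo "something.h") writes the slow variant and (my-make-macro-demo "something-fast.h" t) the variant without underscores; the second file name is just an illustration.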
(Mon, 30 Nov 2020 17:06:02 GMT) Message #35 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 18:04:56 +0100
[Message part 1 (text/plain, inline)]
> Actually, it's the underscores!

Found it. Suggested fix attached.

It can be improved: at least one pair of regexp group brackets can be removed, but I didn't dare do so because I wasn't sure if it would throw some group numbers off by one.

Alan, please, let's work together and remove unnecessary capture groups from the regexps! Even XEmacs regexps support non-capturing brackets, \(?:...\), and they save time, regexp stack space, and reduce the hassle of computing the 'regexp depth' everywhere.
[cc-underscores.diff (application/octet-stream, attachment)]
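To illustrate the point about capture groups, here is a small sketch of how switching a group to the non-capturing form \(?:...\) shifts the numbering of later groups. The two regexps and the name `my-group-demo` are made up for illustration; they are not taken from CC Mode or from the attached patch, whose contents are not shown in the thread.

```elisp
;; Sketch only: capturing versus non-capturing ("shy") groups.
;; The regexps are illustrative and do not come from CC Mode.
(defun my-group-demo (string)
  "Return (CAPTURING . SHY), the identifier matched in STRING by each regexp."
  (let ((capturing "\\(static\\|extern\\) +\\([a-z_]+\\)")
        (shy       "\\(?:static\\|extern\\) +\\([a-z_]+\\)"))
    (cons
     ;; With two capturing groups the identifier is group 2.
     (and (string-match capturing string) (match-string 2 string))
     ;; With the first group made shy the identifier becomes group 1.
     (and (string-match shy string) (match-string 1 string)))))
```

Both variants match the same text, but every `match-beginning`, `match-end` or `match-string` call that refers to a later group has to be renumbered, which is the off-by-one risk mentioned above. `regexp-opt-depth` reports the number of capturing groups in a regexp and can help when checking such changes.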
(Mon, 30 Nov 2020 18:31:01 GMT) Message #38 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: acm <at> muc.de, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 30 Nov 2020 18:30:16 +0000
Hello, Mattias.

On Mon, Nov 30, 2020 at 13:46:30 +0100, Mattias Engdegård wrote:
> >> https://raw.githubusercontent.com/qca/qcamain_open_hal_public/master/hal/ar9300/osprey_reg_map_macro.h

> > I tried reproducing this on a pretty new laptop, and opening the file
> > in question (with your settings) took less than a second with Emacs
> > 28.

> My lappy is less new but not really that slow -- compared to the
> hardware of the original reporter it's a speed demon -- but opening the
> file takes almost 4 s here. More importantly, scrolling through the
> file is painfully slow.

Hah! I just tried it, all the way through the file, and it took me
3568.429881811142 seconds, i.e. all of an hour bar 32 seconds. My
machine is in no way slow, being a first generation AMD Ryzen from 2017.

> Alan, do you have a diagnosis?

Other than what I told you last post (lack of braces), not yet, but I'm
going to take the first tenth of the OP's file (which is 4 MB) for
testing on.

--
Alan Mackenzie (Nuremberg, Germany).
(Tue, 01 Dec 2020 05:50:02 GMT) Message #41 received at 25706 <at> debbugs.gnu.org:
From: Ravine Var <ravine.var <at> gmail.com> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Alan Mackenzie <acm <at> muc.de>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 01 Dec 2020 11:18:40 +0530
Mattias Engdegård <mattiase <at> acm.org> writes:

> Found it. Suggested fix attached.
>
> It can be improved: at least one pair of regexp group brackets can be
> removed, but I didn't dare doing so because I wasn't sure if it would
> throw some group numbers off by one.

Thanks for working on this!

Will this patch fix the problem with big header files like
the one originally reported?

I tested this patch and the issue is still there.

Also, such header files are very common. For example:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/include/asic_reg
(Tue, 01 Dec 2020 09:22:01 GMT) Message #44 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 09:21:09 +0000
Hello, Mattias.

On Mon, Nov 30, 2020 at 17:53:04 +0100, Mattias Engdegård wrote:
> 30 nov. 2020 kl. 17.38 skrev Alan Mackenzie <acm <at> muc.de>:

> > Yes. I've had a look at the file, and it's large and lacking in
> > braces. There are functions in CC Mode which search backwards for
> > opening braces to establish context. When there are none, the
> > search goes back to BOB. Lots of these searches, not efficiently
> > cached, take a long time.

> > It's a problem with CC Mode, not with the source file. It's a known
> > problem, and not easy to fix.

> Actually, it's the underscores!

> Demo: fill a file with the line pairs

> #define abc_defg_hij_klm__nop_qrst_uvw_xyz_w__ooa_cin_e__aoi__uynv(s) \
> 0

> repeated 1000 times, thus making it 2000 lines. Save as something.h. Slow!
> Now replace each underscore with a letter. Save. Fast!

> Fontifying the 2000 line file (with underscores) takes longer than the
> original 80000 line file.

Hey, wonderful! I haven't tried it yet, but I did try this:

(i) Take the first 10% of the original 4MB file, and save it in a
  different file.
(ii) Fontify that file from top to bottom: according to EPL, 292s
(iii) Insert 9 new lines "{}" every 10% of that new file.
(iv) Fontify the amended file top to bottom: new time 98s.

That's a factor of 3 difference.

> I started going through c-find-decl-spots and
> c-find-decl-prefix-search (together there are while statements nested
> 4 deep) but am not sure exactly where the trouble is. A regexp?
> Something syntax-char related (since '_' has symbol syntax, not word)?

> CC-mode in general thrashes the regexp cache; the miss rate is at 27 %
> for the original file, which is way too high. Enlarging the cache
> enough to eliminate misses helps, but not nearly enough.

So, you reckon replacing "\\(" by "\\(?:" wherever the first isn't
really needed would make a big difference? Have I understood you right?
If so, I've got a big job ahead of me, going through all the regexps in
CC Mode doing the replacement, and fixing all the match-beginnings and
match-ends, and so on, which depend on them.

--
Alan Mackenzie (Nuremberg, Germany).
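Step (iii) of the experiment above, inserting "{}" lines at regular intervals, might be done along these lines. This is a sketch under assumptions: `my-insert-brace-lines` is a hypothetical helper, and it spaces the insertions by line count, whereas the message does not say how the 10% positions were chosen.

```elisp
;; Sketch only: insert N lines containing "{}" at evenly spaced places.
;; `my-insert-brace-lines' is a hypothetical helper, not part of Emacs.
(defun my-insert-brace-lines (&optional n)
  "Insert N (default 9) lines \"{}\" at evenly spaced line starts.
Operates on the current buffer."
  (let* ((n (or n 9))
         (total (count-lines (point-min) (point-max)))
         (step (/ total (1+ n)))
         (i n))
    (save-excursion
      ;; Go from the last position to the first, so that earlier
      ;; insertions do not shift the line numbers still to be visited.
      (while (> i 0)
        (goto-char (point-min))
        (forward-line (* i step))
        (insert "{}\n")
        (setq i (1- i))))))
```

One caveat: a position chosen this way can fall between a backslash-continued #define line and its continuation, which would change that macro, so the positions may need adjusting by hand.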
(Tue, 01 Dec 2020 09:31:01 GMT) Message #47 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 09:29:52 +0000
Hello again, Mattias.

On Mon, Nov 30, 2020 at 18:04:56 +0100, Mattias Engdegård wrote:
> > Actually, it's the underscores!

> Found it. Suggested fix attached.

> It can be improved: at least one pair of regexp group brackets can be
> removed, but I didn't dare doing so because I wasn't sure if it would
> throw some group numbers off by one.

> Alan, please, let's work together and remove unnecessary capture groups
> from the regexps! Even XEmacs regexps support non-capturing brackets,
> \(?:...\), and they save time, regexp stack space, and reduce the
> hassle of computing the 'regexp depth' everywhere.

There are 342 occurrences of '\\\\([^?]' in CC Mode. Most of these can
surely be replaced by "\\(?:", but not all, by a long way. This change
will be fun.

--
Alan Mackenzie (Nuremberg, Germany).
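A count like the one above can be reproduced from inside Emacs with a sketch such as the following. The function name `my-count-capturing-groups` is hypothetical. It counts purely textual matches in the current buffer, so occurrences in comments or doc strings are included, and the result need not equal the 342 quoted above for any particular version of CC Mode.

```elisp
;; Sketch only: count textual occurrences of \\( not followed by ?
;; in the current buffer.  Hypothetical helper, not part of Emacs.
(defun my-count-capturing-groups ()
  "Return how many times \\\\( followed by a non-? occurs in the buffer."
  ;; As a regexp this is the same pattern as in the message:
  ;; two literal backslashes, an open paren, then any character but ?.
  (how-many "\\\\\\\\([^?]" (point-min) (point-max)))
```

Evaluating (my-count-capturing-groups) with M-: in a buffer visiting one of the CC Mode source files gives the count for that file.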
(Tue, 01 Dec 2020 09:45:01 GMT) Message #50 received at 25706 <at> debbugs.gnu.org:
From: martin rudalics <rudalics <at> gmx.at> To: Alan Mackenzie <acm <at> muc.de>, Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 10:44:31 +0100
[Message part 1 (text/plain, inline)]
> There are 342 occurrences of '\\\\([^?]' in CC Mode. Most of these can
> surely be replaced by "\\(?:", but not all, by a long way. This change
> will be fun.

Years ago I wrote the attached that might help you in this regard (load
it and do 'turn-on-regexp-lock-mode'). If you move point before the "("
of a "\\(" it should give you the appropriate nesting.

martin
[regexp-lock.el (text/x-emacs-lisp, attachment)]
(Tue, 01 Dec 2020 10:08:01 GMT) Message #53 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: martin rudalics <rudalics <at> gmx.at> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 10:07:08 +0000
Hello, Martin.

On Tue, Dec 01, 2020 at 10:44:31 +0100, martin rudalics wrote:
> > There are 342 occurrences of '\\\\([^?]' in CC Mode. Most of these can
> > surely be replaced by "\\(?:", but not all, by a long way. This change
> > will be fun.

> Years ago I wrote the attached that might help you in this regard (load
> it and do 'turn-on-regexp-lock-mode'). If you move point before the "("
> of a "\\(" it should give you the appropriate nesting.

Thanks! I'll have a look at it.

> martin

--
Alan Mackenzie (Nuremberg, Germany).
(Tue, 01 Dec 2020 12:04:01 GMT) Message #56 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 13:03:21 +0100
1 dec. 2020 kl. 10.21 skrev Alan Mackenzie <acm <at> muc.de>:

> (i) Take the first 10% of the original 4MB file, and save it in a
> different file.
> (ii) Fontify that file from top to bottom: according to EPL, 292s
> (iii) Insert 9 new lines "{}" every 10% of that new file.
> (iv) Fontify the amended file top to bottom: new time 98s.
>
> That's a factor of 3 different.

Thank you, quite remarkable and a very useful piece of information!

Please let me curb some unwarranted optimism that I'm guilty of engendering:

We have been measuring slightly different things. Being lazy, I timed the fontification in one go:

(font-lock-ensure (point-min) (point-max))

which took about 65 s originally and went down to about 24 s by fixing the regexps as previously mentioned. Much better but still not wonderful.

You have measured interactive scrolling, which is more realistic, but fontifying the buffer piecemeal exercises slightly different code paths. Fixing those regexps helps but not as much, and clearly more work is needed.

(By the way, could you direct me to your benchmark code? I don't think I have it.)

Still, improving regexps is clearly beneficial. Reducing allocation can be effective as well; a fair bit of the profile is in the GC.
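The one-go measurement mentioned above could be wrapped up roughly like this. It is a sketch, not the code that produced the quoted timings: `my-time-full-fontification` is a hypothetical helper, and it fontifies a temporary buffer rather than a displayed one, so the numbers may differ from those in the thread.

```elisp
;; Sketch only: time fontifying a whole C file in one go.
;; `my-time-full-fontification' is a hypothetical helper, not part of Emacs.
(require 'benchmark)

(defun my-time-full-fontification (file)
  "Fontify all of FILE in C Mode and return the `benchmark-run' result.
The result is a list (ELAPSED GC-COUNT GC-ELAPSED), times in seconds."
  (with-temp-buffer
    (insert-file-contents file)
    (c-mode)
    ;; One run of the whole-buffer fontification used in the message.
    (benchmark-run 1
      (font-lock-ensure (point-min) (point-max)))))
```

The second and third elements of the result show how many garbage collections ran and how long they took, which gives a rough view of the GC share mentioned above.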
(Tue, 01 Dec 2020 12:58:01 GMT) Message #59 received at 25706 <at> debbugs.gnu.org:
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 12:57:34 +0000
Hello, Mattias.

On Tue, Dec 01, 2020 at 13:03:21 +0100, Mattias Engdegård wrote:
> 1 dec. 2020 kl. 10.21 skrev Alan Mackenzie <acm <at> muc.de>:

> > (i) Take the first 10% of the original 4MB file, and save it in a
> >   different file.
> > (ii) Fontify that file from top to bottom: according to EPL, 292s
> > (iii) Insert 9 new lines "{}" every 10% of that new file.
> > (iv) Fontify the amended file top to bottom: new time 98s.

> > That's a factor of 3 different.

> Thank you, quite remarkable and a very useful piece of information!

> Please let me curb some unwarranted optimism that I'm guilty of
> engendering:

> We have been measuring slightly different things. Being lazy, I timed
> the fontification in one go:

>   (font-lock-ensure (point-min) (point-max))

> which took about 65 s originally and went down to about 24 s by fixing
> the regexps as previously mentioned. Much better but still not
> wonderful.

> You have measured interactive scrolling which is more realistic, but
> fontifying the buffer piecemeal it exercises slightly different code
> paths. Fixing those regexps helps but not as much, and clearly more
> work is needed.

> (By the way, could you direct me to your benchmark code? I don't think
> I have it.)

Just something I threw together a few years ago, and use regularly on
xdisp.c to check nothing's gone seriously slow/see how well my latest
optimisation has worked.

(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
    ,@forms
    (- (float-time) start)))

(defun time-scroll (&optional arg)
  (interactive "P")
  (message "%s"
           (time-it
            (condition-case nil
                (while t
                  (if arg (scroll-down) (scroll-up))
                  (sit-for 0))
              (error nil)))))

Put point at the start or end of a buffer and do M-: (time-scroll) or
M-: (time-scroll t) as appropriate.

> Still, improving regexps is clearly beneficial. Reducing allocation can
> be effective as well; a fair bit of the profile is in the GC.

How much time does this regexp change save on a "normal" file, such as
src/xdisp.c?

--
Alan Mackenzie (Nuremberg, Germany).
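For comparison with the scrolling benchmark above, here is a minimal sketch of the "one go" measurement Mattias describes, which also reports how much of the time went to garbage collection. The function name `my-time-font-lock-ensure` is hypothetical and not part of the thread; `benchmark-run`, `font-lock-flush` and `font-lock-ensure` are standard Emacs facilities (Emacs 25 or later assumed).

```elisp
(require 'benchmark)

(defun my-time-font-lock-ensure ()
  "Fontify the whole current buffer in one go and report the timing.
Hypothetical helper, not from the thread.  Returns the list
\(ELAPSED GC-COUNT GC-ELAPSED) produced by `benchmark-run'."
  (interactive)
  ;; Discard any existing fontification so the whole buffer is redone.
  (font-lock-flush)
  (let ((result (benchmark-run 1
                  (font-lock-ensure (point-min) (point-max)))))
    (message "total %.2fs, %d GCs taking %.2fs"
             (nth 0 result) (nth 1 result) (nth 2 result))
    result))
```

Visit the file, then run M-x my-time-font-lock-ensure. As noted in the thread, this exercises slightly different code paths from piecemeal fontification during scrolling, so the two numbers are not directly comparable.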
bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org
:bug#25706
; Package emacs,cc-mode
.
(Tue, 01 Dec 2020 13:36:02 GMT) Full text and rfc822 format available.Message #62 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Ravine Var <ravine.var <at> gmail.com> Cc: Alan Mackenzie <acm <at> muc.de>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 14:34:58 +0100
1 dec. 2020 kl. 06.48 skrev Ravine Var <ravine.var <at> gmail.com>:

> Will this patch fix the problem with big header files like
> the one originally reported ?

Unfortunately it seems that my benchmarking approach was misleading; see my previous reply to Alan. Sorry about that.

The patch helps a bit but not nearly enough, so for big header files like the ones you mention in the asic_reg directory, it may not make much of a difference. It is obviously worthwhile, but again as Alan noted, the incremental fontifying cost increases with distance from the start of the file (absent any actual code other than preprocessor definitions), leading to the observed superlinear behaviour. More robust heuristics needed.
(Tue, 01 Dec 2020 14:08:02 GMT) Full text and rfc822 format available.Message #65 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 15:07:02 +0100
1 dec. 2020 kl. 13.57 skrev Alan Mackenzie <acm <at> muc.de>:

> Just something I threw together a few years ago, and use regularly on
> xdisp.c to check nothing's gone seriously slow/see how well my latest
> optimisation has worked.

Thank you, good, I just wanted to know that we are measuring the same thing!

> How much time does this regexp change save on a "normal" file, such as
> src/xdisp.c?

Not much, but clearly measurable -- about 1.5 % (scrolling benchmark).

What can be done for big files that mainly consist of preprocessor definitions?
(Tue, 01 Dec 2020 15:28:02 GMT) Full text and rfc822 format available.Message #71 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: acm <at> muc.de, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 15:27:11 +0000
Hello, Mattias.

On Tue, Dec 01, 2020 at 15:07:02 +0100, Mattias Engdegård wrote:
> 1 dec. 2020 kl. 13.57 skrev Alan Mackenzie <acm <at> muc.de>:

> > Just something I threw together a few years ago, and use regularly on
> > xdisp.c to check nothing's gone seriously slow/see how well my latest
> > optimisation has worked.

> Thank you, good, I just wanted to know that we are measuring the same thing!

> > How much time does this regexp change save on a "normal" file, such as
> > src/xdisp.c?

> Not much, but clearly measurable -- about 1.5 % (scrolling benchmark).

Ah. ;-) Do you think the difference might be significantly more if I
were systematically to expunge "\\("s from CC Mode?

> What can be done for big files that mainly consist of preprocessor
> definitions?

Add in yet another cache (or fix the existing cache which is buggy) for
whatever it is that's searching backwards for braces.

The cache would look something like (P . St) meaning P is the position of
the highest brace before St. P nil would mean there was no opening brace
at all before St. So any backward search for a { starting between P and
St could just return P, any search starting after St. would only need to
search back to St, and so on.

It's rather messy and easy not to get right first time, but it could make
a tremendous difference to these crazy include files. I put in a cache
like that for macros after somebody complained about the sluggishness in
his file (which was basically a single 4,000 line macro).

--
Alan Mackenzie (Nuremberg, Germany).
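The (P . St) idea can be sketched roughly as follows. This is an illustration only, not CC Mode code: the names `my-brace-cache` and `my-last-open-brace-before` are hypothetical, the search ignores comments, strings and macros, and the cache is never invalidated when the buffer changes, which a real implementation would have to handle.

```elisp
(defvar-local my-brace-cache nil
  "Hypothetical cache cell (P . ST), or nil when nothing is cached.
P is the position of the last \"{\" before ST, or nil if there is none.")

(defun my-last-open-brace-before (start)
  "Return the position of the last \"{\" before START, or nil.
Purely illustrative: ignores comments, strings and macros, and
assumes the buffer text has not changed since the cache was filled."
  (let ((p (car my-brace-cache))
        (st (cdr my-brace-cache)))
    (if (and st (<= start st) (or (null p) (< p start)))
        ;; START lies in the stretch the cache already answers for.
        p
      (let* ((bound (and st (< st start) st))
             (found (save-excursion
                      (goto-char start)
                      ;; When START is beyond ST, scan back only as far
                      ;; as ST; the cache covers everything before it.
                      (and (search-backward "{" bound t)
                           (point)))))
        ;; Nothing between ST and START: fall back on the cached P.
        (setq found (or found (and bound p)))
        (setq my-brace-cache (cons found start))
        found))))
```

In a long run of preprocessor lines with no braces, each successive call then scans only the text added since the previous call instead of going all the way back to the last brace or to the beginning of the buffer.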
(Tue, 01 Dec 2020 19:00:02 GMT) Full text and rfc822 format available.Message #74 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 1 Dec 2020 19:59:04 +0100
1 dec. 2020 kl. 16.27 skrev Alan Mackenzie <acm <at> muc.de>:

> Ah. ;-) Do you think the difference might be significantly more if I
> were systematically to expunge "\\("s from CC Mode?

No, probably not. It's just obvious low-hanging fruit; every little helps some. Doing so also makes the regexps a little less mystifying for the reader since the only capture groups left are those actually used. Finally, it removes or at least raises some hard limits that we had in the past (from regexp stack overflow).

> Add in yet another cache (or fix the existing cache which is buggy) for
> whatever it is that's searching backwards for braces.

Are the bugs in the existing cache preventing it from making the cases under discussion faster?

A naïve question: the files we are talking about are dominated by (mostly single-line) preprocessor directives whose fontification should be invariant of context (as long as they are not inside comments or strings, but that's not hard to find out). Why do we then spend time looking for context at all?

From profiling, it seems that about 30 % of the time is spent in c-determine-limit, called from c-fl-decl-start, c-font-lock-enclosing-decls and c-font-lock-cut-off-declarators (about 10 % each).
(Wed, 02 Dec 2020 10:16:02 GMT) Full text and rfc822 format available.Message #77 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 2 Dec 2020 10:15:29 +0000
Hello, Mattias.

On Tue, Dec 01, 2020 at 19:59:04 +0100, Mattias Engdegård wrote:
> 1 dec. 2020 kl. 16.27 skrev Alan Mackenzie <acm <at> muc.de>:

> > Ah. ;-) Do you think the difference might be significantly more if I
> > were systematically to expunge "\\("s from CC Mode?

> No, probably not. It's just obvious low-hanging fruit; every little
> helps some. Doing so also makes the regexps a little less mystifying
> for the reader since the only capture groups left are those actually
> used. Finally, it removes or at least raises some hard limits that we
> had in the past (from regexp stack overflow).

OK. That's a project for ASAP, but not, then, urgent.

> > Add in yet another cache (or fix the existing cache which is buggy)
> > for whatever it is that's searching backwards for braces.

> Are the bugs in the existing cache preventing it from making the cases
> under discussion faster?

I spent yesterday evening investigating the "CC Mode state cache", i.e.
the thing that keeps track of braces and open parens/brackets. I found a
place where it was unnecessarily causing scanning from BOB, and fixed it
provisionally. On doing a (time-scroll) on the entire monster buffer, it
saved ~25% of the run time. There is definitely something else scanning
repeatedly from BOB - the screen scrolling was more sluggish near the end
of the buffer than half way through.

Here's that provisional patch, if you'd like to try it:

diff -r 863d08a1858a cc-engine.el
--- a/cc-engine.el Thu Nov 26 11:27:52 2020 +0000
+++ b/cc-engine.el Wed Dec 02 09:55:50 2020 +0000
@@ -3672,9 +3672,9 @@
             how-far 0))
      ((<= good-pos here)
       (setq strategy 'forward
-            start-point (if changed-macro-start
-                            cache-pos
-                          (max good-pos cache-pos))
+            start-point ;; (if changed-macro-start OLD STOUGH, 2020-12-01
+                        ;; cache-pos
+                          (max good-pos cache-pos);; )
             how-far (- here start-point)))
      ((< (- good-pos here) (- here cache-pos)) ; FIXME!!! ; apply some sort of weighting.
       (setq strategy 'backward

> A naïve question: the files we are talking about are dominated by
> (mostly single-line) preprocessor directives whose fontification should
> be invariant of context (as long as they are not inside comments or
> strings, but that's not hard to find out). Why do we then spend time
> looking for context at all?

Because many situations are context dependent, particularly in C++ Mode.
That raises the possibility of not tracking context for these monster
files.h, but how would one distinguish between these different "types" of
CC Mode file?

> From profiling, it seems that about 30 % of the time is spent in
> c-determine-limit, called from c-fl-decl-start,
> c-font-lock-enclosing-decls and c-font-lock-cut-off-declarators (about
> 10 % each).

Yes. c-determine-limit scans backwards over a buffer to find a position
that is around N non-string non-comment characters before point.

I put some instrumentation on it yesterday evening, and it is apparent
that it is getting called four times in succession from the same point
with N = 500, 1000, 1000, 1000. This screams out for a simple cache,
which I intend to implement. Also, maybe I should always call
c-determine-limit with the same N, and perhaps even cut N to 500 in all
cases. Or something like that. It is clear that a great deal of run
time could be saved, here.

Also, I intend to track down whatever the other thing is that is scanning
from the previous brace or BOB. It may be possible to alter the handling
of these monster files from impossibly slow to somewhat sluggish.

--
Alan Mackenzie (Nuremberg, Germany).
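One possible shape for the "simple cache" mentioned above, as a hedged sketch rather than what CC Mode actually does: remember results keyed on (point . N) and throw them all away whenever the buffer text changes. The names `my-limit-cache` and `my-cached-limit` are hypothetical, and COMPUTE merely stands in for an expensive function such as `c-determine-limit'.

```elisp
(defvar-local my-limit-cache nil
  "Hypothetical cache of the form (TICK . ALIST).
TICK is the value of `buffer-chars-modified-tick' when the cache was
filled; ALIST maps keys (POS . N) to previously computed results.")

(defun my-cached-limit (n compute)
  "Return (funcall COMPUTE N) at point, reusing a cached value if possible.
COMPUTE stands in for an expensive function of point and N."
  (let ((tick (buffer-chars-modified-tick))
        (key (cons (point) n)))
    ;; Any change to the buffer text invalidates everything cached so far.
    (unless (eql (car my-limit-cache) tick)
      (setq my-limit-cache (list tick)))
    (let ((hit (assoc key (cdr my-limit-cache))))
      (if hit
          (cdr hit)
        ;; COMPUTE may move point, so protect the caller from that.
        (let ((value (save-excursion (funcall compute n))))
          (push (cons key value) (cdr my-limit-cache))
          value)))))
```

With this wrapper, four successive calls from the same point with N = 500, 1000, 1000, 1000 would run COMPUTE only twice. Always calling with the same N, as suggested above, would reduce that to once.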
(Wed, 02 Dec 2020 15:07:02 GMT) Full text and rfc822 format available.Message #80 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 2 Dec 2020 16:06:43 +0100
2 dec. 2020 kl. 11.15 skrev Alan Mackenzie <acm <at> muc.de>:

> I spent yesterday evening investigating the "CC Mode state cache", i.e.
> the thing that keeps track of braces and open parens/brackets. I found a
> place where it was unnecessarily causing scanning from BOB, and fixed it
> provisionally. On doing a (time-scroll) on the entire monster buffer, it
> saved ~25% of the run time. There is definitely something else scanning
> repeatedly from BOB - the screen scrolling was more sluggish near the end
> of the buffer than half way through.
>
> Here's that provisional patch, if you'd like to try it:

Thanks, it does indeed speed things up in various synthetic tests as well. You are right that there still seems to be at least a quadratic term left.

> Because many situations are context dependent, particularly in C++ Mode.
> That raises the possibility of not tracking context for these monster
> files.h, but how would one distinguish between these different "types" of
> CC Mode file?

Please bear with my lack of understanding of how this works, but what I meant is that a preprocessor line neither affects nor is affected by the context, so until something other than such lines (and comments) is found in the region being fontified, there should be no need to determine the context in the first place.

> I put some instrumentation on it yesterday evening, and it is apparent
> that it is getting called four times in succession from the same point
> with N = 500, 1000, 1000, 1000. This screams out for a simple cache,
> which I intend to implement. Also, maybe I should always call
> c-determine-limit with the same N, and perhaps even cut N to 500 in all
> cases. Or something like that. It is clear that a great deal of run
> time could be saved, here.
>
> Also, I intend to track down whatever the other thing is that is scanning
> from the previous brace or BOB. It may be possible to alter the handling
> of these monster files from impossibly slow to somewhat sluggish.

There is optimism then! Some of the files from the Linux tree mentioned by Ravine Var are also good to try, such as

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/drivers/gpu/drm/amd/include/asic_reg/bif/bif_5_1_sh_mask.h
(Thu, 03 Dec 2020 10:49:02 GMT) Full text and rfc822 format available.Message #83 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 3 Dec 2020 10:48:23 +0000
Hello, Mattias.

On Wed, Dec 02, 2020 at 16:06:43 +0100, Mattias Engdegård wrote:
> 2 dec. 2020 kl. 11.15 skrev Alan Mackenzie <acm <at> muc.de>:

> > I spent yesterday evening investigating the "CC Mode state cache", i.e.
> > the thing that keeps track of braces and open parens/brackets. I found a
> > place where it was unnecessarily causing scanning from BOB, and fixed it
> > provisionally. On doing a (time-scroll) on the entire monster buffer, it
> > saved ~25% of the run time. There is definitely something else scanning
> > repeatedly from BOB - the screen scrolling was more sluggish near the end
> > of the buffer than half way through.

I've found it. There was a "harmless" c-backward-syntactic-ws invocation
in c-determine-limit. This macro moves back over syntactic whitespace,
which includes macros. So this was going back all the way to BOB, from
which we scanned forward again.

In the enclosed patch (which includes my previous amendment) I've removed
this. There are many other places which invoke c-backward-syntactic-ws
without giving the limit argument, and these slow down CC Mode too,
though not as dramatically as the removed one. I have given limit
arguments to two of these in c-font-lock-complex-decl-prepare, which
reduce the (time-scroll) time for the last 10% of the entire monster file
from ~77s to ~44s.

I intend to instrument c-backward-sws to determine which of the other
invocations of c-backward-syntactic-ws are most time consuming. There
are around 90 such calls in CC Mode. :-(

It now takes me just under 6 minutes to (time-scroll) through the entire
buffer, compared with a previous hour. As already mentioned, it is still
slightly more sluggish near the end of the buffer than near the start.

> > Here's that provisional patch, if you'd like to try it:

So, here's another provisional patch:

diff -r 863d08a1858a cc-engine.el
--- a/cc-engine.el Thu Nov 26 11:27:52 2020 +0000
+++ b/cc-engine.el Thu Dec 03 10:43:45 2020 +0000
@@ -3672,9 +3672,7 @@
             how-far 0))
      ((<= good-pos here)
       (setq strategy 'forward
-            start-point (if changed-macro-start
-                            cache-pos
-                          (max good-pos cache-pos))
+            start-point (max good-pos cache-pos)
             how-far (- here start-point)))
      ((< (- good-pos here) (- here cache-pos)) ; FIXME!!! ; apply some sort of weighting.
       (setq strategy 'backward
@@ -5778,8 +5776,6 @@
   ;; Get a "safe place" approximately TRY-SIZE characters before START.
   ;; This defsubst doesn't preserve point.
   (goto-char start)
-  (c-backward-syntactic-ws)
-  (setq start (point))
   (let* ((pos (max (- start try-size) (point-min)))
          (s (c-semi-pp-to-literal pos))
          (cand (or (car (cddr s)) pos)))
diff -r 863d08a1858a cc-fonts.el
--- a/cc-fonts.el Thu Nov 26 11:27:52 2020 +0000
+++ b/cc-fonts.el Thu Dec 03 10:43:45 2020 +0000
@@ -948,7 +948,7 @@
     ;; closest token before the region.
     (save-excursion
       (let ((pos (point)))
-        (c-backward-syntactic-ws)
+        (c-backward-syntactic-ws (max (- (point) 500) (point-min)))
         (c-clear-char-properties
          (if (and (not (bobp))
                   (memq (c-get-char-property (1- (point)) 'c-type)
@@ -970,7 +970,7 @@
     ;; The declared identifiers are font-locked correctly as types, if
     ;; that is what they are.
     (let ((prop (save-excursion
-                  (c-backward-syntactic-ws)
+                  (c-backward-syntactic-ws (max (- (point) 500) (point-min)))
                   (unless (bobp)
                     (c-get-char-property (1- (point)) 'c-type)))))
       (when (memq prop '(c-decl-id-start c-decl-type-start))

[ .... ]

--
Alan Mackenzie (Nuremberg, Germany).
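The effect of passing a limit, as in the (max (- (point) 500) (point-min)) expressions in the patch, can be seen in isolation with a much simpler primitive. The sketch below uses `skip-chars-backward', whose optional second argument is a position it will not move before. The function name `my-skip-ws-backward` is hypothetical, and unlike `c-backward-syntactic-ws' it knows nothing about comments or macros.

```elisp
(defun my-skip-ws-backward (&optional max-distance)
  "Move point back over spaces, tabs and newlines; return the new point.
If MAX-DISTANCE is non-nil, never move back further than that many
characters.  Hypothetical stand-in for a bounded
`c-backward-syntactic-ws' call; it ignores comments and macros."
  (let ((limit (and max-distance
                    (max (- (point) max-distance) (point-min)))))
    ;; With LIMIT nil the skip is unbounded and may run all the way
    ;; back to the beginning of the buffer.
    (skip-chars-backward " \t\n" limit)
    (point)))
```

Without the limit, the cost of each call grows with the length of the skippable stretch before point, which for CC Mode's notion of syntactic whitespace includes whole runs of preprocessor lines. With the limit, each call does a bounded amount of work, at the price of the caller having to cope with stopping early.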
(Thu, 03 Dec 2020 14:04:02 GMT) Full text and rfc822 format available.Message #86 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 3 Dec 2020 15:03:27 +0100
3 dec. 2020 kl. 11.48 skrev Alan Mackenzie <acm <at> muc.de>:

> I've found it. There was a "harmless" c-backward-syntactic-ws invocation
> in c-determine-limit. This macro moves back over syntactic whitespace,
> which includes macros. So this was going back all the way to BOB, from
> which we scanned forward again.

Not bad. Now Emacs starts becoming usable for real code! I can confirm a big subjective improvement on several big preprocessor-heavy files, and measurements agree.

> It now takes me just under 6 minutes to (time-scroll) through the entire
> buffer, compared with a previous hour. As already mentioned, it is still
> slightly more sluggish near the end of the buffer than near the start.

Is that with or without my regexp patch?

It looks like there may be more regexp improvements possible. We can take a closer look later on, when the running time is less dominated by other issues.
(Fri, 04 Dec 2020 21:06:02 GMT) Full text and rfc822 format available.Message #89 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Fri, 4 Dec 2020 21:04:50 +0000
Hello, Mattias.

On Thu, Dec 03, 2020 at 15:03:27 +0100, Mattias Engdegård wrote:
> 3 dec. 2020 kl. 11.48 skrev Alan Mackenzie <acm <at> muc.de>:

> > I've found it. There was a "harmless" c-backward-syntactic-ws
> > invocation in c-determine-limit. This macro moves back over
> > syntactic whitespace, which includes macros. So this was going back
> > all the way to BOB, from which we scanned forward again.

> Not bad. Now Emacs starts becoming usable for real code! I can confirm
> a big subjective improvement on several big preprocessor-heavy files,
> and measurements agree.

I think you'll like my latest provisional patch!

I've tracked down and eliminated a ~0.5s delay when typing characters
into a "monster" buffer near the end.

> > It now takes me just under 6 minutes to (time-scroll) through the entire
> > buffer, compared with a previous hour. As already mentioned, it is still
> > slightly more sluggish near the end of the buffer than near the start.

With the latest patch, it takes me 121s.

> Is that with or without my regexp patch?

Without.

> It looks like there may be more regexp improvements possible. We can
> take a closer look later on, when the running time is less dominated by
> other issues.

Maybe that time is now. Please try the latest patch. I think there are
still things needing optimisation in C++ Mode (make sure your monster
buffers are in C Mode, please). But for now....

diff --git a/lisp/progmodes/cc-engine.el b/lisp/progmodes/cc-engine.el
index 252eec138c..22e6ef5894 100644
--- a/lisp/progmodes/cc-engine.el
+++ b/lisp/progmodes/cc-engine.el
@@ -972,7 +972,7 @@ c-beginning-of-statement-1
         ;; that we've moved.
         (while (progn
                  (setq pos (point))
-                 (c-backward-syntactic-ws)
+                 (c-backward-syntactic-ws lim)
                  ;; Protect post-++/-- operators just before a virtual semicolon.
(and (not (c-at-vsemi-p)) (/= (skip-chars-backward "-+!*&~@`#") 0)))) @@ -984,7 +984,7 @@ c-beginning-of-statement-1 (if (and (memq (char-before) delims) (progn (forward-char -1) (setq saved (point)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (or (memq (char-before) delims) (memq (char-before) '(?: nil)) (eq (char-syntax (char-before)) ?\() @@ -1164,7 +1164,7 @@ c-beginning-of-statement-1 ;; HERE IS THE SINGLE PLACE INSIDE THE PDA LOOP WHERE WE MOVE ;; BACKWARDS THROUGH THE SOURCE. - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (let ((before-sws-pos (point)) ;; The end position of the area to search for statement ;; barriers in this round. @@ -1188,7 +1188,7 @@ c-beginning-of-statement-1 ((and (not macro-start) (c-beginning-of-macro)) (save-excursion - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (setq before-sws-pos (point))) ;; Have we crossed a statement boundary? If not, ;; keep going back until we find one or a "real" sexp. @@ -1413,7 +1413,7 @@ c-beginning-of-statement-1 ;; Skip over the unary operators that can start the statement. (while (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) ;; protect AWK post-inc/decrement operators, etc. (and (not (c-at-vsemi-p (point))) (/= (skip-chars-backward "-.+!*&~@`#") 0))) @@ -3568,15 +3568,18 @@ c-get-fallback-scan-pos ;; Return a start position for building `c-state-cache' from ;; scratch. This will be at the top level, 2 defuns back. (save-excursion - ;; Go back 2 bods, but ignore any bogus positions returned by - ;; beginning-of-defun (i.e. open paren in column zero). - (goto-char here) - (let ((cnt 2)) - (while (not (or (bobp) (zerop cnt))) - (c-beginning-of-defun-1) ; Pure elisp BOD. 
- (if (eq (char-after) ?\{) - (setq cnt (1- cnt))))) - (point))) + (save-restriction + (when (> here (* 10 c-state-cache-too-far)) + (narrow-to-region (- here (* 10 c-state-cache-too-far)) here)) + ;; Go back 2 bods, but ignore any bogus positions returned by + ;; beginning-of-defun (i.e. open paren in column zero). + (goto-char here) + (let ((cnt 2)) + (while (not (or (bobp) (zerop cnt))) + (c-beginning-of-defun-1) ; Pure elisp BOD. + (if (eq (char-after) ?\{) + (setq cnt (1- cnt))))) + (point)))) (defun c-state-balance-parens-backwards (here- here+ top) ;; Return the position of the opening paren/brace/bracket before HERE- which @@ -3667,9 +3670,7 @@ c-parse-state-get-strategy how-far 0)) ((<= good-pos here) (setq strategy 'forward - start-point (if changed-macro-start - cache-pos - (max good-pos cache-pos)) + start-point (max good-pos cache-pos) how-far (- here start-point))) ((< (- good-pos here) (- here cache-pos)) ; FIXME!!! ; apply some sort of weighting. (setq strategy 'backward @@ -4337,8 +4338,12 @@ c-invalidate-state-cache-1 (if (and dropped-cons (<= too-high-pa here)) (c-append-lower-brace-pair-to-state-cache too-high-pa here here-bol)) - (setq c-state-cache-good-pos (or (c-state-cache-after-top-paren) - (c-state-get-min-scan-pos))))) + (if (and c-state-cache-good-pos (< here c-state-cache-good-pos)) + (setq c-state-cache-good-pos + (or (save-excursion + (goto-char here) + (c-literal-start)) + here))))) ;; The brace-pair desert marker: (when (car c-state-brace-pair-desert) @@ -5402,8 +5407,11 @@ c-syntactic-skip-backward ;; Optimize for, in particular, large blocks of comments from ;; `comment-region'. (progn (when opt-ws - (c-backward-syntactic-ws) - (setq paren-level-pos (point))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (> (point) limit) + (setq paren-level-pos (point)) + (goto-char opt-pos)))) t) ;; Move back to a candidate end point which isn't in a literal ;; or in a macro we didn't start in. 
@@ -5423,7 +5431,10 @@ c-syntactic-skip-backward (setq macro-start (point)))) (goto-char macro-start)))) (when opt-ws - (c-backward-syntactic-ws))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (<= (point) limit) + (goto-char opt-pos))))) (< (point) pos)) ;; Check whether we're at the wrong level of nesting (when @@ -5766,8 +5777,6 @@ c-determine-limit-get-base ;; Get a "safe place" approximately TRY-SIZE characters before START. ;; This defsubst doesn't preserve point. (goto-char start) - (c-backward-syntactic-ws) - (setq start (point)) (let* ((pos (max (- start try-size) (point-min))) (s (c-semi-pp-to-literal pos)) (cand (or (car (cddr s)) pos))) @@ -6248,8 +6257,13 @@ c-find-decl-prefix-search ;; preceding syntactic ws to set `cfd-match-pos' and to catch ;; any decl spots in the syntactic ws. (unless cfd-re-match - (c-backward-syntactic-ws) - (setq cfd-re-match (point)))) + (let ((cfd-cbsw-lim (- (point) 1000))) + (c-backward-syntactic-ws cfd-cbsw-lim) + (setq cfd-re-match + (if (> (point) cfd-cbsw-lim) + (point) + 0))) ; Set BOB case if the token's too far back. + )) ;; Choose whichever match is closer to the start. (if (< cfd-re-match cfd-prop-match) @@ -6482,7 +6496,10 @@ c-find-decl-spots (c-invalidate-find-decl-cache cfd-start-pos) (setq syntactic-pos (point)) - (unless (eq syntactic-pos c-find-decl-syntactic-pos) + (unless + (or (eq syntactic-pos c-find-decl-syntactic-pos) + (null c-find-decl-syntactic-pos) + (< c-find-decl-syntactic-pos (- (point) 10000))) ;; Don't have to do this if the cache is relevant here, ;; typically if the same line is refontified again. If ;; we're just some syntactic whitespace further down we can diff --git a/lisp/progmodes/cc-fonts.el b/lisp/progmodes/cc-fonts.el index bb7e5bea6e..07dcefb8d1 100644 --- a/lisp/progmodes/cc-fonts.el +++ b/lisp/progmodes/cc-fonts.el @@ -947,7 +947,7 @@ c-font-lock-complex-decl-prepare ;; closest token before the region. 
(save-excursion (let ((pos (point))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (c-clear-char-properties (if (and (not (bobp)) (memq (c-get-char-property (1- (point)) 'c-type) @@ -969,7 +969,7 @@ c-font-lock-complex-decl-prepare ;; The declared identifiers are font-locked correctly as types, if ;; that is what they are. (let ((prop (save-excursion - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (unless (bobp) (c-get-char-property (1- (point)) 'c-type))))) (when (memq prop '(c-decl-id-start c-decl-type-start)) @@ -1496,7 +1496,8 @@ c-font-lock-declarations ;; Check we haven't missed a preceding "typedef". (when (not (looking-at c-typedef-key)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) (c-backward-token-2) (or (looking-at c-typedef-key) (goto-char start-pos))) @@ -1536,8 +1537,10 @@ c-font-lock-declarations (c-backward-token-2) (and (not (looking-at c-opt-<>-sexp-key)) - (progn (c-backward-syntactic-ws) - (memq (char-before) '(?\( ?,))) + (progn + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) + (memq (char-before) '(?\( ?,))) (not (eq (c-get-char-property (1- (point)) 'c-type) 'c-decl-arg-start)))))) @@ -2295,7 +2298,8 @@ c-font-lock-c++-using (and c-colon-type-list-re (c-go-up-list-backward) (eq (char-after) ?{) - (eq (car (c-beginning-of-decl-1)) 'same) + (eq (car (c-beginning-of-decl-1 + (c-determine-limit 1000))) 'same) (looking-at c-colon-type-list-re))) ;; Inherited protected member: leave unfontified ) -- Alan Mackenzie (Nuremberg, Germany).
(Sat, 05 Dec 2020 15:22:02 GMT) Full text and rfc822 format available.Message #92 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Sat, 5 Dec 2020 16:20:54 +0100
4 dec. 2020 kl. 22.04 skrev Alan Mackenzie <acm <at> muc.de>:

> I think you'll like my latest provisional patch!
>
> I've tracked down and eliminated a ~0.5s delay when typing characters
> into a "monster" buffer near the end.

That's nice, thank you! It seems to be about 19 % faster than the previous patch on this particular file, which is not bad at all.

Somehow, the delay when inserting a newline (pressing return) at line 83610 of osprey_reg_map_macro.h becomes longer with the patch. Of course this is more than compensated by the speed-up in general, but it may be worth taking a look at.

There is also a new and noticeable delay (0.5-1 s) in the very beginning when scrolling through the file. (This is with the frame sized to show 41 lines of 80 chars of a window, excluding mode line and echo area.)
(Tue, 08 Dec 2020 18:43:02 GMT) Full text and rfc822 format available.Message #95 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 8 Dec 2020 18:42:35 +0000
Hello again, Mattias.

On Sat, Dec 05, 2020 at 16:20:54 +0100, Mattias Engdegård wrote:
> 4 dec. 2020 kl. 22.04 skrev Alan Mackenzie <acm <at> muc.de>:

[ .... ]

> That's nice, thank you! It seems to be about 19 % faster than the
> previous patch on this particular file, which is not bad at all.

Well, the enclosed patch improves on this a little, particularly in C++
Mode. (Trying the monster file.h in C++ Mode is now something worth
trying).

Just as a matter of interest, I've done a fair bit of testing with a
larger monster file (~14 MB) in the Linux kernel, at
linux/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h .
That's 133,000 lines, give or take. Even our largest file, src/xdisp.c
is only 36,000 lines. I don't understand how a file describing hardware
can come to anything like 133k lines. It must be soul destroying to have
to write a driver based on a file like this. That file was put together
by AMD, and I suspect they didn't take all that much care to make it
usable.

> Somehow, the delay when inserting a newline (pressing return) at line
> 83610 of osprey_reg_map_macro.h becomes longer with the patch.

I think I've fixed this. Thanks for prompting me.

> Of course this is more than compensated by the speed-up in general,
> but it may be worth taking a look at.

There's one thing which still puzzles me. In osprey_reg....h, when
scrolling through it (e.g. with (time-scroll)), it stutters markedly at
around 13% of the way through. I've managed to localize this, it's
happening in the macro c-find-decl-prefix-search (invoked only from
c-find-decl-spots), and has something to do with the call to
re-search-forward there, but I've not managed to pin down exactly what
the cause is.

> There is also a new and noticeable delay (0.5-1 s) in the very
> beginning when scrolling through the file. (This is with the frame
> sized to show 41 lines of 80 chars of a window, excluding mode line
> and echo area.)

This seems still to be there.
I'll admit, I haven't really looked at this yet. Anyhow, please try out the (?)final version of my patch before I commit it and close the bug. It should apply cleanly to the master branch. I might well split it into three changes, two small, one large, since there are, in a sense three distinct fixes there. Thanks! diff --git a/lisp/progmodes/cc-engine.el b/lisp/progmodes/cc-engine.el index 252eec138c..2365085036 100644 --- a/lisp/progmodes/cc-engine.el +++ b/lisp/progmodes/cc-engine.el @@ -972,7 +972,7 @@ c-beginning-of-statement-1 ;; that we've moved. (while (progn (setq pos (point)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) ;; Protect post-++/-- operators just before a virtual semicolon. (and (not (c-at-vsemi-p)) (/= (skip-chars-backward "-+!*&~@`#") 0)))) @@ -984,7 +984,7 @@ c-beginning-of-statement-1 (if (and (memq (char-before) delims) (progn (forward-char -1) (setq saved (point)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (or (memq (char-before) delims) (memq (char-before) '(?: nil)) (eq (char-syntax (char-before)) ?\() @@ -1164,7 +1164,7 @@ c-beginning-of-statement-1 ;; HERE IS THE SINGLE PLACE INSIDE THE PDA LOOP WHERE WE MOVE ;; BACKWARDS THROUGH THE SOURCE. - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (let ((before-sws-pos (point)) ;; The end position of the area to search for statement ;; barriers in this round. @@ -1174,33 +1174,35 @@ c-beginning-of-statement-1 ;; Go back over exactly one logical sexp, taking proper ;; account of macros and escaped EOLs. (while - (progn - (setq comma-delimited (and (not comma-delim) - (eq (char-before) ?\,))) - (unless (c-safe (c-backward-sexp) t) - ;; Give up if we hit an unbalanced block. Since the - ;; stack won't be empty the code below will report a - ;; suitable error. - (setq pre-stmt-found t) - (throw 'loop nil)) - (cond - ;; Have we moved into a macro? 
- ((and (not macro-start) - (c-beginning-of-macro)) - (save-excursion - (c-backward-syntactic-ws) - (setq before-sws-pos (point))) - ;; Have we crossed a statement boundary? If not, - ;; keep going back until we find one or a "real" sexp. - (and + (and + (progn + (setq comma-delimited (and (not comma-delim) + (eq (char-before) ?\,))) + (unless (c-safe (c-backward-sexp) t) + ;; Give up if we hit an unbalanced block. Since the + ;; stack won't be empty the code below will report a + ;; suitable error. + (setq pre-stmt-found t) + (throw 'loop nil)) + (cond + ;; Have we moved into a macro? + ((and (not macro-start) + (c-beginning-of-macro)) (save-excursion - (c-end-of-macro) - (not (c-crosses-statement-barrier-p - (point) maybe-after-boundary-pos))) - (setq maybe-after-boundary-pos (point)))) - ;; Have we just gone back over an escaped NL? This - ;; doesn't count as a sexp. - ((looking-at "\\\\$"))))) + (c-backward-syntactic-ws lim) + (setq before-sws-pos (point))) + ;; Have we crossed a statement boundary? If not, + ;; keep going back until we find one or a "real" sexp. + (and + (save-excursion + (c-end-of-macro) + (not (c-crosses-statement-barrier-p + (point) maybe-after-boundary-pos))) + (setq maybe-after-boundary-pos (point)))) + ;; Have we just gone back over an escaped NL? This + ;; doesn't count as a sexp. + ((looking-at "\\\\$")))) + (>= (point) lim))) ;; Have we crossed a statement boundary? (setq boundary-pos @@ -1413,7 +1415,7 @@ c-beginning-of-statement-1 ;; Skip over the unary operators that can start the statement. (while (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) ;; protect AWK post-inc/decrement operators, etc. (and (not (c-at-vsemi-p (point))) (/= (skip-chars-backward "-.+!*&~@`#") 0))) @@ -3568,15 +3570,18 @@ c-get-fallback-scan-pos ;; Return a start position for building `c-state-cache' from ;; scratch. This will be at the top level, 2 defuns back. 
(save-excursion - ;; Go back 2 bods, but ignore any bogus positions returned by - ;; beginning-of-defun (i.e. open paren in column zero). - (goto-char here) - (let ((cnt 2)) - (while (not (or (bobp) (zerop cnt))) - (c-beginning-of-defun-1) ; Pure elisp BOD. - (if (eq (char-after) ?\{) - (setq cnt (1- cnt))))) - (point))) + (save-restriction + (when (> here (* 10 c-state-cache-too-far)) + (narrow-to-region (- here (* 10 c-state-cache-too-far)) here)) + ;; Go back 2 bods, but ignore any bogus positions returned by + ;; beginning-of-defun (i.e. open paren in column zero). + (goto-char here) + (let ((cnt 2)) + (while (not (or (bobp) (zerop cnt))) + (c-beginning-of-defun-1) ; Pure elisp BOD. + (if (eq (char-after) ?\{) + (setq cnt (1- cnt))))) + (point)))) (defun c-state-balance-parens-backwards (here- here+ top) ;; Return the position of the opening paren/brace/bracket before HERE- which @@ -3667,9 +3672,7 @@ c-parse-state-get-strategy how-far 0)) ((<= good-pos here) (setq strategy 'forward - start-point (if changed-macro-start - cache-pos - (max good-pos cache-pos)) + start-point (max good-pos cache-pos) how-far (- here start-point))) ((< (- good-pos here) (- here cache-pos)) ; FIXME!!! ; apply some sort of weighting. (setq strategy 'backward @@ -4337,8 +4340,12 @@ c-invalidate-state-cache-1 (if (and dropped-cons (<= too-high-pa here)) (c-append-lower-brace-pair-to-state-cache too-high-pa here here-bol)) - (setq c-state-cache-good-pos (or (c-state-cache-after-top-paren) - (c-state-get-min-scan-pos))))) + (if (and c-state-cache-good-pos (< here c-state-cache-good-pos)) + (setq c-state-cache-good-pos + (or (save-excursion + (goto-char here) + (c-literal-start)) + here))))) ;; The brace-pair desert marker: (when (car c-state-brace-pair-desert) @@ -4796,7 +4803,7 @@ c-on-identifier ;; Handle the "operator +" syntax in C++. 
(when (and c-overloadable-operators-regexp - (= (c-backward-token-2 0) 0)) + (= (c-backward-token-2 0 nil (c-determine-limit 500)) 0)) (cond ((and (looking-at c-overloadable-operators-regexp) (or (not c-opt-op-identifier-prefix) @@ -5065,7 +5072,8 @@ c-backward-token-2 (while (and (> count 0) (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + limit) (backward-char) (if (looking-at jump-syntax) (goto-char (scan-sexps (1+ (point)) -1)) @@ -5402,8 +5410,12 @@ c-syntactic-skip-backward ;; Optimize for, in particular, large blocks of comments from ;; `comment-region'. (progn (when opt-ws - (c-backward-syntactic-ws) - (setq paren-level-pos (point))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (or (null limit) + (> (point) limit)) + (setq paren-level-pos (point)) + (goto-char opt-pos)))) t) ;; Move back to a candidate end point which isn't in a literal ;; or in a macro we didn't start in. @@ -5423,7 +5435,11 @@ c-syntactic-skip-backward (setq macro-start (point)))) (goto-char macro-start)))) (when opt-ws - (c-backward-syntactic-ws))) + (let ((opt-pos (point))) + (c-backward-syntactic-ws limit) + (if (and limit + (<= (point) limit)) + (goto-char opt-pos))))) (< (point) pos)) ;; Check whether we're at the wrong level of nesting (when @@ -5474,7 +5490,7 @@ c-syntactic-skip-backward (progn ;; Skip syntactic ws afterwards so that we don't stop at the ;; end of a comment if `skip-chars' is something like "^/". - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (point))))) ;; We might want to extend this with more useful return values in @@ -5762,12 +5778,23 @@ c-literal-type (t 'c))) ; Assuming the range is valid. range)) +(defun c-determine-limit-no-macro (here org-start) + ;; If HERE is inside a macro, and ORG-START is not also in the same macro, + ;; return the beginning of the macro. Otherwise return HERE. Point is not + ;; preserved by this function. 
+ (goto-char here) + (let ((here-BOM (and (c-beginning-of-macro) (point)))) + (if (and here-BOM + (not (eq (progn (goto-char org-start) + (and (c-beginning-of-macro) (point))) + here-BOM))) + here-BOM + here))) + (defsubst c-determine-limit-get-base (start try-size) ;; Get a "safe place" approximately TRY-SIZE characters before START. ;; This defsubst doesn't preserve point. (goto-char start) - (c-backward-syntactic-ws) - (setq start (point)) (let* ((pos (max (- start try-size) (point-min))) (s (c-semi-pp-to-literal pos)) (cand (or (car (cddr s)) pos))) @@ -5776,20 +5803,23 @@ c-determine-limit-get-base (parse-partial-sexp pos start nil nil (car s) 'syntax-table) (point)))) -(defun c-determine-limit (how-far-back &optional start try-size) +(defun c-determine-limit (how-far-back &optional start try-size org-start) ;; Return a buffer position approximately HOW-FAR-BACK non-literal ;; characters from START (default point). The starting position, either ;; point or START may not be in a comment or string. ;; ;; The position found will not be before POINT-MIN and won't be in a - ;; literal. + ;; literal. It will also not be inside a macro, unless START/point is also + ;; in the same macro. ;; ;; We start searching for the sought position TRY-SIZE (default ;; twice HOW-FAR-BACK) bytes back from START. ;; ;; This function must be fast. :-) + (save-excursion (let* ((start (or start (point))) + (org-start (or org-start start)) (try-size (or try-size (* 2 how-far-back))) (base (c-determine-limit-get-base start try-size)) (pos base) @@ -5842,21 +5872,27 @@ c-determine-limit (setq elt (car stack) stack (cdr stack)) (setq count (+ count (cdr elt)))) - - ;; Have we found enough yet? (cond ((null elt) ; No non-literal characters found. 
- (if (> base (point-min)) - (c-determine-limit how-far-back base (* 2 try-size)) - (point-min))) + (cond + ((> pos start) ; Nothing but literals + base) + ((> base (point-min)) + (c-determine-limit how-far-back base (* 2 try-size) org-start)) + (t base))) ((>= count how-far-back) - (+ (car elt) (- count how-far-back))) + (c-determine-limit-no-macro + (+ (car elt) (- count how-far-back)) + org-start)) ((eq base (point-min)) (point-min)) ((> base (- start try-size)) ; Can only happen if we hit point-min. - (car elt)) + (c-determine-limit-no-macro + (car elt) + org-start)) (t - (c-determine-limit (- how-far-back count) base (* 2 try-size))))))) + (c-determine-limit (- how-far-back count) base (* 2 try-size) + org-start)))))) (defun c-determine-+ve-limit (how-far &optional start-pos) ;; Return a buffer position about HOW-FAR non-literal characters forward @@ -6153,7 +6189,8 @@ c-bs-at-toplevel-p (or (null stack) ; Probably unnecessary. (<= (cadr stack) 1)))) -(defmacro c-find-decl-prefix-search () +(defmacro + c-find-decl-prefix-search () ;; Macro used inside `c-find-decl-spots'. It ought to be a defun, ;; but it contains lots of free variables that refer to things ;; inside `c-find-decl-spots'. The point is left at `cfd-match-pos' @@ -6248,8 +6285,14 @@ c-find-decl-prefix-search ;; preceding syntactic ws to set `cfd-match-pos' and to catch ;; any decl spots in the syntactic ws. (unless cfd-re-match - (c-backward-syntactic-ws) - (setq cfd-re-match (point)))) + (let ((cfd-cbsw-lim + (max (- (point) 1000) (point-min)))) + (c-backward-syntactic-ws cfd-cbsw-lim) + (setq cfd-re-match + (if (or (bobp) (> (point) cfd-cbsw-lim)) + (point) + (point-min)))) ; Set BOB case if the token's too far back. + )) ;; Choose whichever match is closer to the start. 
(if (< cfd-re-match cfd-prop-match) @@ -6410,7 +6453,7 @@ c-find-decl-spots (while (and (not (bobp)) (c-got-face-at (1- (point)) c-literal-faces)) (goto-char (previous-single-property-change - (point) 'face nil (point-min)))) + (point) 'face nil (point-min)))) ; No limit. FIXME, perhaps? 2020-12-07. ;; XEmacs doesn't fontify the quotes surrounding string ;; literals. @@ -6482,12 +6525,15 @@ c-find-decl-spots (c-invalidate-find-decl-cache cfd-start-pos) (setq syntactic-pos (point)) - (unless (eq syntactic-pos c-find-decl-syntactic-pos) + (unless + (eq syntactic-pos c-find-decl-syntactic-pos) ;; Don't have to do this if the cache is relevant here, ;; typically if the same line is refontified again. If ;; we're just some syntactic whitespace further down we can ;; still use the cache to limit the skipping. - (c-backward-syntactic-ws c-find-decl-syntactic-pos)) + (c-backward-syntactic-ws + (max (or c-find-decl-syntactic-pos (point-min)) + (- (point) 10000) (point-min)))) ;; If we hit `c-find-decl-syntactic-pos' and ;; `c-find-decl-match-pos' is set then we install the cached @@ -6613,7 +6659,8 @@ c-find-decl-spots ;; syntactic ws. (when (and cfd-match-pos (< cfd-match-pos syntactic-pos)) (goto-char syntactic-pos) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws + (min (+ (point) 2000) (point-max))) (and cfd-continue-pos (< cfd-continue-pos (point)) (setq cfd-token-pos (point)))) @@ -6654,7 +6701,8 @@ c-find-decl-spots ;; can't be nested, and that's already been done in ;; `c-find-decl-prefix-search'. 
(when (> cfd-continue-pos cfd-token-pos) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws + (min (+ (point) 2000) (point-max))) (setq cfd-token-pos (point))) ;; Continue if the following token fails the @@ -8817,7 +8865,7 @@ c-back-over-member-initializer-braces (or res (goto-char here)) res)) -(defmacro c-back-over-list-of-member-inits () +(defmacro c-back-over-list-of-member-inits (limit) ;; Go back over a list of elements, each looking like: ;; <symbol> (<expression>) , ;; or <symbol> {<expression>} , (with possibly a <....> expressions @@ -8826,21 +8874,21 @@ c-back-over-list-of-member-inits ;; a comma. If either of <symbol> or bracketed <expression> is missing, ;; throw nil to 'level. If the terminating } or ) is unmatched, throw nil ;; to 'done. This is not a general purpose macro! - '(while (eq (char-before) ?,) + `(while (eq (char-before) ?,) (backward-char) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws ,limit) (when (not (memq (char-before) '(?\) ?}))) (throw 'level nil)) (when (not (c-go-list-backward)) (throw 'done nil)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws ,limit) (while (eq (char-before) ?>) (when (not (c-backward-<>-arglist nil)) (throw 'done nil)) - (c-backward-syntactic-ws)) + (c-backward-syntactic-ws ,limit)) (when (not (c-back-over-compound-identifier)) (throw 'level nil)) - (c-backward-syntactic-ws))) + (c-backward-syntactic-ws ,limit))) (defun c-back-over-member-initializers (&optional limit) ;; Test whether we are in a C++ member initializer list, and if so, go back @@ -8859,14 +8907,14 @@ c-back-over-member-initializers (catch 'done (setq level-plausible (catch 'level - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (when (memq (char-before) '(?\) ?})) (when (not (c-go-list-backward)) (throw 'done nil)) - (c-backward-syntactic-ws)) + (c-backward-syntactic-ws limit)) (when (c-back-over-compound-identifier) - (c-backward-syntactic-ws)) - (c-back-over-list-of-member-inits) + (c-backward-syntactic-ws 
limit)) + (c-back-over-list-of-member-inits limit) (and (eq (char-before) ?:) (save-excursion (c-backward-token-2) @@ -8880,14 +8928,14 @@ c-back-over-member-initializers (setq level-plausible (catch 'level (goto-char pos) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws limit) (when (not (c-back-over-compound-identifier)) (throw 'level nil)) - (c-backward-syntactic-ws) - (c-back-over-list-of-member-inits) + (c-backward-syntactic-ws limit) + (c-back-over-list-of-member-inits limit) (and (eq (char-before) ?:) (save-excursion - (c-backward-token-2) + (c-backward-token-2 nil nil limit) (not (looking-at c-:$-multichar-token-regexp))) (c-just-after-func-arglist-p))))) @@ -12012,7 +12060,7 @@ c-looking-at-inexpr-block (goto-char haskell-op-pos)) (while (and (eq res 'maybe) - (progn (c-backward-syntactic-ws) + (progn (c-backward-syntactic-ws lim) (> (point) closest-lim)) (not (bobp)) (progn (backward-char) @@ -12783,7 +12831,7 @@ c-guess-basic-syntax (setq paren-state (cons containing-sexp paren-state) containing-sexp nil))) (setq lim (1+ containing-sexp)))) - (setq lim (point-min))) + (setq lim (c-determine-limit 1000))) ;; If we're in a parenthesis list then ',' delimits the ;; "statements" rather than being an operator (with the @@ -13025,7 +13073,9 @@ c-guess-basic-syntax ;; CASE 4: In-expression statement. C.f. cases 7B, 16A and ;; 17E. ((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) containing-sexp ;; Have to turn on the heuristics after ;; the point even though it doesn't work @@ -13150,7 +13200,8 @@ c-guess-basic-syntax ;; init lists can, in practice, be very large. 
((save-excursion (when (and (c-major-mode-is 'c++-mode) - (setq placeholder (c-back-over-member-initializers))) + (setq placeholder (c-back-over-member-initializers + lim))) (setq tmp-pos (point)))) (if (= (c-point 'bosws) (1+ tmp-pos)) (progn @@ -13469,7 +13520,7 @@ c-guess-basic-syntax ;; CASE 5I: ObjC method definition. ((and c-opt-method-key (looking-at c-opt-method-key)) - (c-beginning-of-statement-1 nil t) + (c-beginning-of-statement-1 (c-determine-limit 1000) t) (if (= (point) indent-point) ;; Handle the case when it's the first (non-comment) ;; thing in the buffer. Can't look for a 'same return @@ -13542,7 +13593,16 @@ c-guess-basic-syntax (if (>= (point) indent-point) (throw 'not-in-directive t)) (setq placeholder (point))) - nil))))) + nil)) + (and macro-start + (not (c-beginning-of-statement-1 lim nil nil nil t)) + (setq placeholder + (let ((ps-top (car paren-state))) + (if (consp ps-top) + (progn + (goto-char (cdr ps-top)) + (c-forward-syntactic-ws indent-point)) + (point-min)))))))) ;; For historic reasons we anchor at bol of the last ;; line of the previous declaration. That's clearly ;; highly bogus and useless, and it makes our lives hard @@ -13591,19 +13651,30 @@ c-guess-basic-syntax (eq (char-before) ?<) (not (and c-overloadable-operators-regexp (c-after-special-operator-id lim)))) - (c-beginning-of-statement-1 (c-safe-position (point) paren-state)) + (c-beginning-of-statement-1 + (or + (c-safe-position (point) paren-state) + (c-determine-limit 1000))) (c-add-syntax 'template-args-cont (c-point 'boi))) ;; CASE 5Q: we are at a statement within a macro. 
- (macro-start - (c-beginning-of-statement-1 containing-sexp) + ((and + macro-start + (save-excursion + (prog1 + (not (eq (c-beginning-of-statement-1 + (or containing-sexp (c-determine-limit 1000)) + nil nil nil t) + nil))) + (setq placeholder (point)))) + (goto-char placeholder) (c-add-stmt-syntax 'statement nil t containing-sexp paren-state)) - ;;CASE 5N: We are at a topmost continuation line and the only + ;;CASE 5S: We are at a topmost continuation line and the only ;;preceding items are annotations. ((and (c-major-mode-is 'java-mode) (setq placeholder (point)) - (c-beginning-of-statement-1) + (c-beginning-of-statement-1 lim) (progn (while (and (c-forward-annotation)) (c-forward-syntactic-ws)) @@ -13615,7 +13686,9 @@ c-guess-basic-syntax ;; CASE 5M: we are at a topmost continuation line (t - (c-beginning-of-statement-1 (c-safe-position (point) paren-state)) + (c-beginning-of-statement-1 + (or (c-safe-position (point) paren-state) + (c-determine-limit 1000))) (when (c-major-mode-is 'objc-mode) (setq placeholder (point)) (while (and (c-forward-objc-directive) @@ -13671,8 +13744,9 @@ c-guess-basic-syntax (setq tmpsymbol '(block-open . inexpr-statement) placeholder (cdr-safe (c-looking-at-inexpr-block - (c-safe-position containing-sexp - paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) containing-sexp))) ;; placeholder is nil if it's a block directly in ;; a function arglist. 
That makes us skip out of @@ -13804,7 +13878,9 @@ c-guess-basic-syntax (setq placeholder (c-guess-basic-syntax)))) (setq c-syntactic-context placeholder) (c-beginning-of-statement-1 - (c-safe-position (1- containing-sexp) paren-state)) + (or + (c-safe-position (1- containing-sexp) paren-state) + (c-determine-limit 1000 (1- containing-sexp)))) (c-forward-token-2 0) (while (cond ((looking-at c-specifier-key) @@ -13838,7 +13914,8 @@ c-guess-basic-syntax (c-add-syntax 'brace-list-close (point)) (setq lim (or (save-excursion (and - (c-back-over-member-initializers) + (c-back-over-member-initializers + (c-determine-limit 1000)) (point))) (c-most-enclosing-brace state-cache (point)))) (c-beginning-of-statement-1 lim nil nil t) @@ -13871,7 +13948,8 @@ c-guess-basic-syntax (c-add-syntax 'brace-list-intro (point)) (setq lim (or (save-excursion (and - (c-back-over-member-initializers) + (c-back-over-member-initializers + (c-determine-limit 1000)) (point))) (c-most-enclosing-brace state-cache (point)))) (c-beginning-of-statement-1 lim nil nil t) @@ -13927,7 +14005,9 @@ c-guess-basic-syntax ;; CASE 16A: closing a lambda defun or an in-expression ;; block? C.f. cases 4, 7B and 17E. ((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) nil)) (setq tmpsymbol (if (eq (car placeholder) 'inlambda) 'inline-close @@ -14090,7 +14170,9 @@ c-guess-basic-syntax ;; CASE 17E: first statement in an in-expression block. ;; C.f. cases 4, 7B and 16A. 
((setq placeholder (c-looking-at-inexpr-block - (c-safe-position containing-sexp paren-state) + (or + (c-safe-position containing-sexp paren-state) + (c-determine-limit 1000 containing-sexp)) nil)) (setq tmpsymbol (if (eq (car placeholder) 'inlambda) 'defun-block-intro diff --git a/lisp/progmodes/cc-fonts.el b/lisp/progmodes/cc-fonts.el index bb7e5bea6e..166cbd7a49 100644 --- a/lisp/progmodes/cc-fonts.el +++ b/lisp/progmodes/cc-fonts.el @@ -947,7 +947,7 @@ c-font-lock-complex-decl-prepare ;; closest token before the region. (save-excursion (let ((pos (point))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (c-clear-char-properties (if (and (not (bobp)) (memq (c-get-char-property (1- (point)) 'c-type) @@ -969,7 +969,7 @@ c-font-lock-complex-decl-prepare ;; The declared identifiers are font-locked correctly as types, if ;; that is what they are. (let ((prop (save-excursion - (c-backward-syntactic-ws) + (c-backward-syntactic-ws (max (- (point) 500) (point-min))) (unless (bobp) (c-get-char-property (1- (point)) 'c-type))))) (when (memq prop '(c-decl-id-start c-decl-type-start)) @@ -1008,15 +1008,24 @@ c-font-lock-<>-arglists (boundp 'parse-sexp-lookup-properties))) (c-parse-and-markup-<>-arglists t) c-restricted-<>-arglists - id-start id-end id-face pos kwd-sym) + id-start id-end id-face pos kwd-sym + old-pos) (while (and (< (point) limit) - (re-search-forward c-opt-<>-arglist-start limit t)) - - (setq id-start (match-beginning 1) - id-end (match-end 1) - pos (point)) - + (setq old-pos (point)) + (c-syntactic-re-search-forward "<" limit t nil t)) + (setq pos (point)) + (save-excursion + (backward-char) + (c-backward-syntactic-ws old-pos) + (if (re-search-backward + (concat "\\(\\`\\|" c-nonsymbol-key "\\)\\(" c-symbol-key"\\)\\=") + old-pos t) + (setq id-start (match-beginning 2) + id-end (match-end 2)) + (setq id-start nil id-end nil))) + + (when id-start (goto-char id-start) (unless (c-skip-comments-and-strings limit) (setq 
kwd-sym nil @@ -1033,7 +1042,7 @@ c-font-lock-<>-arglists (when (looking-at c-opt-<>-sexp-key) ;; There's a special keyword before the "<" that tells ;; that it's an angle bracket arglist. - (setq kwd-sym (c-keyword-sym (match-string 1))))) + (setq kwd-sym (c-keyword-sym (match-string 2))))) (t ;; There's a normal identifier before the "<". If we're not in @@ -1067,7 +1076,7 @@ c-font-lock-<>-arglists 'font-lock-type-face)))))) (goto-char pos))) - (goto-char pos)))))) + (goto-char pos))))))) nil) (defun c-font-lock-declarators (limit list types not-top @@ -1496,7 +1505,8 @@ c-font-lock-declarations ;; Check we haven't missed a preceding "typedef". (when (not (looking-at c-typedef-key)) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) (c-backward-token-2) (or (looking-at c-typedef-key) (goto-char start-pos))) @@ -1536,8 +1546,10 @@ c-font-lock-declarations (c-backward-token-2) (and (not (looking-at c-opt-<>-sexp-key)) - (progn (c-backward-syntactic-ws) - (memq (char-before) '(?\( ?,))) + (progn + (c-backward-syntactic-ws + (max (- (point) 1000) (point-min))) + (memq (char-before) '(?\( ?,))) (not (eq (c-get-char-property (1- (point)) 'c-type) 'c-decl-arg-start)))))) @@ -2295,7 +2307,8 @@ c-font-lock-c++-using (and c-colon-type-list-re (c-go-up-list-backward) (eq (char-after) ?{) - (eq (car (c-beginning-of-decl-1)) 'same) + (eq (car (c-beginning-of-decl-1 + (c-determine-limit 1000))) 'same) (looking-at c-colon-type-list-re))) ;; Inherited protected member: leave unfontified ) diff --git a/lisp/progmodes/cc-langs.el b/lisp/progmodes/cc-langs.el index d6089ea295..4d1aeaa5cb 100644 --- a/lisp/progmodes/cc-langs.el +++ b/lisp/progmodes/cc-langs.el @@ -699,6 +699,7 @@ c-populate-syntax-table ;; The same thing regarding Unicode identifiers applies here as to ;; `c-symbol-key'. 
t (concat "[" (c-lang-const c-nonsymbol-chars) "]")) +(c-lang-defvar c-nonsymbol-key (c-lang-const c-nonsymbol-key)) (c-lang-defconst c-identifier-ops "The operators that make up fully qualified identifiers. nil in diff --git a/lisp/progmodes/cc-mode.el b/lisp/progmodes/cc-mode.el index c5201d1af5..df9709df94 100644 --- a/lisp/progmodes/cc-mode.el +++ b/lisp/progmodes/cc-mode.el @@ -499,11 +499,14 @@ c-unfind-coalesced-tokens (save-excursion (when (< beg end) (goto-char beg) + (let ((lim (c-determine-limit 1000)) + (lim+ (c-determine-+ve-limit 1000 end))) (when (and (not (bobp)) - (progn (c-backward-syntactic-ws) (eq (point) beg)) + (progn (c-backward-syntactic-ws lim) (eq (point) beg)) (/= (skip-chars-backward c-symbol-chars (1- (point))) 0) - (progn (goto-char beg) (c-forward-syntactic-ws) (<= (point) end)) + (progn (goto-char beg) (c-forward-syntactic-ws lim+) + (<= (point) end)) (> (point) beg) (goto-char end) (looking-at c-symbol-char-key)) @@ -514,14 +517,14 @@ c-unfind-coalesced-tokens (goto-char end) (when (and (not (eobp)) - (progn (c-forward-syntactic-ws) (eq (point) end)) + (progn (c-forward-syntactic-ws lim+) (eq (point) end)) (looking-at c-symbol-char-key) - (progn (c-backward-syntactic-ws) (>= (point) beg)) + (progn (c-backward-syntactic-ws lim) (>= (point) beg)) (< (point) end) (/= (skip-chars-backward c-symbol-chars (1- (point))) 0)) (goto-char (1+ end)) (c-end-of-current-token) - (c-unfind-type (buffer-substring-no-properties end (point))))))) + (c-unfind-type (buffer-substring-no-properties end (point)))))))) ;; c-maybe-stale-found-type records a place near the region being ;; changed where an element of `found-types' might become stale. It @@ -1993,10 +1996,10 @@ c-before-change ;; inserting stuff after "foo" in "foo bar;", or ;; before "foo" in "typedef foo *bar;"? ;; - ;; We search for appropriate c-type properties "near" - ;; the change. First, find an appropriate boundary - ;; for this property search. 
- (let (lim + ;; We search for appropriate c-type properties "near" the + ;; change. First, find an appropriate boundary for this + ;; property search. + (let (lim lim-2 type type-pos marked-id term-pos (end1 @@ -2007,8 +2010,11 @@ c-before-change (when (>= end1 beg) ; Don't hassle about changes entirely in ; comments. ;; Find a limit for the search for a `c-type' property + ;; Point is currently undefined. A `goto-char' somewhere is needed. (2020-12-06). + (setq lim-2 (c-determine-limit 1000 (point) ; that is wrong. FIXME!!! (2020-12-06) + )) (while - (and (/= (skip-chars-backward "^;{}") 0) + (and (/= (skip-chars-backward "^;{}" lim-2) 0) (> (point) (point-min)) (memq (c-get-char-property (1- (point)) 'face) '(font-lock-comment-face font-lock-string-face)))) @@ -2032,7 +2038,8 @@ c-before-change (buffer-substring-no-properties (point) type-pos))) (goto-char end1) - (skip-chars-forward "^;{}") ; FIXME!!! loop for + (setq lim-2 (c-determine-+ve-limit 1000)) + (skip-chars-forward "^;{}" lim-2) ; FIXME!!! loop for ; comment, maybe (setq lim (point)) (setq term-pos @@ -2270,9 +2277,11 @@ c-fl-decl-end ;; preserved. (goto-char pos) (let ((lit-start (c-literal-start)) + (lim (c-determine-limit 1000)) enclosing-attribute pos1) (unless lit-start - (c-backward-syntactic-ws) + (c-backward-syntactic-ws + lim) (when (setq enclosing-attribute (c-enclosing-c++-attribute)) (goto-char (car enclosing-attribute))) ; Only happens in C++ Mode. (when (setq pos1 (c-on-identifier)) @@ -2296,14 +2305,14 @@ c-fl-decl-end (setq pos1 (c-on-identifier)) (goto-char pos1) (progn - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (eq (char-before) ?\()) (c-fl-decl-end (1- (point)))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (point)))) (and (progn (c-forward-syntactic-ws lim) (not (eobp))) - (c-backward-syntactic-ws) + (c-backward-syntactic-ws lim) (point))))))))) (defun c-change-expand-fl-region (_beg _end _old-len) -- Alan Mackenzie (Nuremberg, Germany).
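Editorial illustration: many hunks in the patch above replace an unbounded `(c-backward-syntactic-ws)` with a call that passes a search limit, for example `(max (- (point) 500) (point-min))`. The idea can be sketched with plain Emacs primitives. This is a simplified sketch, not CC Mode code: `my-skip-ws-backward-bounded` is a hypothetical name, and it only skips whitespace characters, whereas `c-backward-syntactic-ws` also skips comments and preprocessor lines.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-skip-ws-backward-bounded (max-distance)
  "Skip whitespace backward from point, but at most MAX-DISTANCE characters.
Return the new position if a non-whitespace character or the beginning of
the buffer was reached.  Return nil if the scan stopped at the limit, in
which case the caller should treat the previous token as too far away."
  (let ((lim (max (- (point) max-distance) (point-min))))
    ;; The second argument stops the scan at LIM, so the cost of one
    ;; call is bounded by MAX-DISTANCE however large the buffer is.
    (skip-chars-backward " \t\n\r\f" lim)
    (if (or (bobp) (> (point) lim))
        (point)
      nil)))
```

Stopping exactly at the limit is reported as a failure even if a token happens to sit right there. That conservative choice mirrors the `c-find-decl-prefix-search` hunk, which falls back to `(point-min)` when the token is too far back.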
(Tue, 08 Dec 2020 19:33:01 GMT) Message #98 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 8 Dec 2020 20:32:12 +0100
Hello Alan,

On 8 Dec 2020, at 19:42, Alan Mackenzie <acm <at> muc.de> wrote:

> That's 133,000 lines, give or take. Even our largest file,
> src/xdisp.c is only 36,000 lines. I don't understand how a file
> describing hardware can come to anything like 133k lines. It must be
> soul destroying to have to write a driver based on a file like this.
> That file was put together by AMD, and I suspect they didn't take all
> that much care to make it usable.

Those files are likely not hand-written but generated from a hardware description language where device registers are declared in more comfortable ways, and often are part of or at least have tie-ins to VLSI synthesis tools.

Nevertheless, there are quite big files that are crafted by hand, and in any case users need to look at them sooner or later in an editor anyway (hence the bug report), so the speed-up job here is essential and benefits everyone.

> There's one thing which still puzzles me. In osprey_reg....h, when
> scrolling through it (e.g. with (time-scroll)), it stutters markedly at
> around 13% of the way through.

Tried applying my regexp patch? It should reduce the pain, which may indicate that the stuttering is caused by severe regexp backtracking effects.

> Anyhow, please try out the (?)final version of my patch before I commit
> it and close the bug. It should apply cleanly to the master branch. I
> might well split it into three changes, two small, one large, since
> there are, in a sense three distinct fixes there.

Thank you very much, I'll take a look, and as promised I'll put together a more detailed guide to what I think could be done about some of the regexps.
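Editorial illustration: the `(time-scroll)` helper referred to in the quoted text is not defined in this part of the thread. A minimal sketch of what such a measurement might look like follows, assuming it simply scrolls the window to the end of the buffer and reports the elapsed time. The names `my-elapsed-seconds` and `my-time-scroll` are hypothetical and are not taken from the patch.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my-elapsed-seconds (thunk)
  "Call THUNK with no arguments and return the wall-clock seconds it took."
  (let ((start (float-time)))
    (funcall thunk)
    (- (float-time) start)))

(defun my-time-scroll ()
  "Scroll the selected window to the end of its buffer, a screenful at a time.
Force a redisplay after each step so that fontification really happens,
then report and return the elapsed time in seconds."
  (interactive)
  (let ((elapsed
         (my-elapsed-seconds
          (lambda ()
            (condition-case nil
                (while t
                  (scroll-up)     ; signals `end-of-buffer' at the end
                  (redisplay t))  ; display (and fontify) the new screenful
              (end-of-buffer nil))))))
    (message "Scrolled to end of buffer in %.2f s" elapsed)
    elapsed))
```

Running `M-x my-time-scroll` from the top of a large header before and after applying a patch gives a rough comparison. The figure depends on the window size, so the frame geometry should be kept the same between runs.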
(Wed, 09 Dec 2020 07:46:01 GMT) Message #101 received at 25706 <at> debbugs.gnu.org:
From: Ravine Var <ravine.var <at> gmail.com> To: Alan Mackenzie <acm <at> muc.de> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 09 Dec 2020 13:01:31 +0530
Alan Mackenzie <acm <at> muc.de> writes:

> Anyhow, please try out the (?)final version of my patch before I commit
> it and close the bug. It should apply cleanly to the master branch. I
> might well split it into three changes, two small, one large, since
> there are, in a sense three distinct fixes there.

I tested this patch, along with Mattias' patch posted earlier, on two machines.

On a reasonably fast machine (AMD Ryzen 3 3200G with 16 GB RAM), there is a marked improvement in visiting and scrolling the header files in the linux kernel tree. The complete lockups that happened earlier did not happen.

I also tested the patches on a Chromebook (Intel Celeron N2840 with 4GB RAM), which is similar to the machine in the original report. Unfortunately, the behavior was still bad, with lockups and freezing. I tried both c-mode and c++-mode with font-lock-maximum-decoration set to 2.
(Wed, 09 Dec 2020 07:59:01 GMT) Message #104 received at 25706 <at> debbugs.gnu.org:
From: Ravine Var <ravine.var <at> gmail.com> To: 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 09 Dec 2020 13:17:23 +0530
I came across another place where a similar lockup happens (even with the patches posted here).

https://gitlab.com/wireshark/wireshark/-/raw/master/epan/dissectors/packet-rrc.c

Towards the end of the file, once we get to the function proto_register_rrc(void), the slowdown of scrolling starts and eventually things freeze.

Just copying that function to a smaller C file is enough to reproduce the issue. (I found that C-M-h is a nifty command to do this.)

I can open a new bug report if required.
(Wed, 09 Dec 2020 17:01:02 GMT) Message #107 received at 25706 <at> debbugs.gnu.org:
From: Mattias Engdegård <mattiase <at> acm.org> To: Alan Mackenzie <acm <at> muc.de> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 9 Dec 2020 18:00:30 +0100
First, some Emacs regexp basics:

1. If A and B match single characters, then A\|B should be written [AB] whenever possible. The reason is that A\|B adds a backtrack record which uses stack space and wastes time if matching fails later on. The cost can be quite noticeable, which we have seen.

2. Syntax-class constructs are usually better written as character alternatives when possible. The \sX construct, for some X, is typically somewhat slower to match than explicitly listing the characters to match. For example, if all you care about are space and tab, then "\\s *" should be written "[ \t]*".

3. Unicode character classes are slower to match than ASCII-only ones. For example, [[:alpha:]] is slower than [A-Za-z], assuming only those characters are of interest.

4. [^...] will match \n unless included in the set. For example, "[^a]\\|$" will almost never match the $ (end-of-line) branch, because a newline will be matched by the first branch. The only exception is at the very end of the buffer if it is not newline-terminated, but that is rarely worth considering for source code.

5. \r (carriage return) normally doesn't appear in buffers even if the file uses DOS line endings. Line endings are converted into a single \n (newline) when the buffer is read. In particular, $ does NOT match at \r, only before \n. When \r appears it is usually because the file contains a mixture of line-ending styles, typically from being edited using broken tools. Whether you want to take such files into account is a matter of judgement; most modes don't bother.

6. Capturing groups costs more than non-capturing groups, but you already know that.

On to specifics: here are annotations for possible improvements in cc-langs.el. (I didn't bother about capturing groups here.)
[cc-regexp-annot.diff (application/octet-stream, attachment)]
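Several of the regexp points above can be checked directly from Lisp. What follows is a minimal sketch, not taken from the attached diff: the sample strings are invented for illustration, and only the standard functions `string-match', `string-match-p' and `match-end' are used.

```elisp
;; Rule 1: a character set matches the same text as a single-character
;; alternation, without the backtrack records the alternation needs.
(string-match-p "\\(?:a\\|b\\)+c" "ababc")   ; alternation form
(string-match-p "[ab]+c" "ababc")            ; preferred form

;; Rule 2: when only space and tab matter, list them explicitly
;; instead of using the whitespace syntax class.
(string-match-p "\\s-*x" " \tx")             ; syntax-class form
(string-match-p "[ \t]*x" " \tx")            ; preferred form

;; Rule 4: [^a] also matches a newline, so the first branch wins
;; and the match consumes the newline itself.
(string-match "[^a]\\|$" "a\nb")
(match-end 0)   ; 2: the newline at index 1 was consumed

;; Excluding \n from the set lets the $ branch match instead
;; (an empty match just before the newline).
(string-match "[^a\n]\\|$" "a\nb")
(match-end 0)   ; 1

;; Rule 5: $ matches before \n, not before \r.
(string-match-p "b$" "ab\r\n")   ; nil
(string-match-p "b$" "ab\n")     ; 1
```

Both forms in rules 1 and 2 accept the same input here; the difference Mattias describes is in matching cost, which this sketch does not measure.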
bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org
:bug#25706
; Package emacs,cc-mode
.
(Wed, 09 Dec 2020 18:48:01 GMT) Full text and rfc822 format available.Message #110 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 9 Dec 2020 18:46:55 +0000
Hello, Ravine. Thanks for doing all this testing! On Wed, Dec 09, 2020 at 13:01:31 +0530, Ravine Var wrote: > Alan Mackenzie <acm <at> muc.de> writes: > > Anyhow, please try out the (?)final version of my patch before I commit > > it and close the bug. It should apply cleanly to the master branch. I > > might well split it into three changes, two small, one large, since > > there are, in a sense three distinct fixes there. > I tested this patch, along with Mattias' patch posted earlier, on two > machines. > On a reasonably fast machine (AMD Ryzen 3 3200G with 16 GB RAM), there > is a marked improvement in visiting and scrolling the header files > in the linux kernel tree. The complete lockups that happened earlier > did not happen. That is close to the spec of my machine, and I find that these large .h files (without braces), with the patch, now work fast enough for me. > I also tested the patches on a Chromebook (Intel Celeron N2840 with 4GB > RAM), which is similar to the machine in the original report. > Unfortunately, the behavior was still bad, with lockups and freezing. > I tried both c-mode and c++-mode with font-lock-maximum-decoration set > to 2. Thank you indeed for taking the trouble to test the patch on the lesser machine. I do not have access to such a machine. I am assuming that before this patch, such a large file like osprey_reg....h would have been completely unworkable on the machine. It sounds as though it still is. However, have you noticed any improvement at all in performance? Could I ask you please to do one more thing, and that is to take a profile on this machine where it is giving trouble. From a freshly loaded buffer, move forward (if necessary) to a troublesome spot. N.B. C-u 1 M-> moves to 10% away from the end of the buffer, C-u 2 M-> 20%, and so on. Then start the profiler and do what is causing sluggish performance. Then have a look at the final profiler output, and expand it sensibly so that the troublesome function can be found. 
(Optional paragraph.) How to use the profiler: Do M-x profiler-start RET, and accept the default mode with another RET. Perform the stuff to be profiled. Do M-x profiler-report, which gives three or four lines of output, each with a number and a percentage. Move point to a line with a large percentage and type RET to expand it. You can repeat this to expand further. Please expand the lines down to where the percentages remaining are around 5% or 6%. There will be quite a lot of lines near the start showing the same large percentage. Then could you please post that output here, so as to give me some idea of where the poor performance is coming from. Thanks! -- Alan Mackenzie (Nuremberg, Germany).
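The profiling steps described above can also be driven from Lisp. This is only a sketch: `my-profile-scrolling' is a hypothetical helper name, not part of Emacs or CC Mode, and it assumes that scrolling is the action to be profiled. `profiler-start', `profiler-report' and `profiler-stop' are the standard profiler commands.

```elisp
(require 'profiler)

(defun my-profile-scrolling (screens)
  "Profile scrolling forward SCREENS screenfuls, then show the report.
Hypothetical helper for illustration; not part of Emacs or CC Mode."
  (interactive "p")
  (profiler-start 'cpu)          ; the mode M-x profiler-start offers by default
  (unwind-protect
      (dotimes (_ screens)
        (scroll-up)
        (sit-for 0))             ; force redisplay, and hence fontification
    (profiler-report)            ; opens the expandable report buffer
    (profiler-stop)))
```

With the troublesome buffer current, something like C-u 10 M-x my-profile-scrolling would scroll ten screens and then open the report, which can be expanded with RET as described.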
(Wed, 09 Dec 2020 20:05:01 GMT) Full text and rfc822 format available.Message #113 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Alan Mackenzie <acm <at> muc.de> Cc: ravine.var <at> gmail.com, larsi <at> gnus.org, 25706 <at> debbugs.gnu.org, mattiase <at> acm.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 09 Dec 2020 22:04:20 +0200
> Date: Wed, 9 Dec 2020 18:46:55 +0000 > From: Alan Mackenzie <acm <at> muc.de> > Cc: Mattias Engdegård <mattiase <at> acm.org>, > Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org > > Move point to a line with a large percentage and type RET to expand > it. You can repeat this to expand further. Please expand the lines > down to where the percentages remaining are around 5% or 6%. There > will be quite a lot of lines near the start showing the same large > percentage. One can also expand everything with "C-u RET".
(Wed, 09 Dec 2020 20:33:02 GMT) Full text and rfc822 format available.Message #116 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Eli Zaretskii <eliz <at> gnu.org> Cc: ravine.var <at> gmail.com, larsi <at> gnus.org, 25706 <at> debbugs.gnu.org, mattiase <at> acm.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Wed, 9 Dec 2020 20:32:05 +0000
Hello, Eli. On Wed, Dec 09, 2020 at 22:04:20 +0200, Eli Zaretskii wrote: > > Date: Wed, 9 Dec 2020 18:46:55 +0000 > > From: Alan Mackenzie <acm <at> muc.de> > > Cc: Mattias Engdegård <mattiase <at> acm.org>, > > Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org > > Move point to a line with a large percentage and type RET to expand > > it. You can repeat this to expand further. Please expand the lines > > down to where the percentages remaining are around 5% or 6%. There > > will be quite a lot of lines near the start showing the same large > > percentage. > One can also expand everything with "C-u RET". Thanks. I didn't know that. I don't think that's in the Elisp manual. Also useful would be a command to expand "everything which is sufficiently big" for some value of "sufficiently big", to avoid swathes of irrelevancies down at 1% or 0%. I once tried to amend the profiler to move its statistics columns further to the right, because I was seeing far too many truncated function names. But I gave up, because the code was masses and masses of tiny functions, largely without doc strings or comments, and I just couldn't make sense of it. -- Alan Mackenzie (Nuremberg, Germany).
(Thu, 10 Dec 2020 08:09:02 GMT) Full text and rfc822 format available.Message #119 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 10 Dec 2020 08:08:44 +0000
Hello again, Ravine. On Wed, Dec 09, 2020 at 13:17:23 +0530, Ravine Var wrote: > I came across another place where a similar lockup happens > (even with the patches posted here). > https://gitlab.com/wireshark/wireshark/-/raw/master/epan/dissectors/packet-rrc.c > Towards the end of the file, once we get to the function > proto_register_rrc(void), the slowdown of scrolling starts and eventually > things freeze. Ouch! That's a 50,000 line long function. ;-( I've lost some naivety about "reasonableness" in the past week or two. > Just copying that function to a smaller C file is enough to > reproduce the issue. (I found that C-M-h is a nifty command to do this.) > I can open a new bug report if required. Would you do this, please. The mechanism for the slowdown in that function is entirely different from that in the .h files with lots of macros. In the .c file, there are lots and lots of braces, and it seems we need a new cache to handle them faster. In the .h files, there are no braces, and we needed to put limits into backward searches. Thanks again for taking the trouble to report all these bugs. -- Alan Mackenzie (Nuremberg, Germany).
(Thu, 10 Dec 2020 12:27:01 GMT) Full text and rfc822 format available.Message #122 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Mattias Engdegård <mattiase <at> acm.org> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 10 Dec 2020 12:26:48 +0000
Hello, Mattias. Thanks for this! On Wed, Dec 09, 2020 at 18:00:30 +0100, Mattias Engdegård wrote: > First, some Emacs regexp basics: > 1. If A and B match single characters, then A\|B should be written > [AB] whenever possible. The reason is that A\|B adds a backtrack > record which uses stack space and wastes time if matching fails later > on. The cost can be quite noticeable, which we have seen. > 2. Syntax-class constructs are usually better written as character > alternatives when possible. > The \sX construct, for some X, is typically somewhat slower to match > than explicitly listing the characters to match. For example, if all > you care about are space and tab, then "\\s *" should be written "[ > \t]*". > 3. Unicode character classes are slower to match than ASCII-only ones. > For example, [[:alpha:]] is slower than [A-Za-z], assuming only those > characters are of interest. > 4. [^...] will match \n unless included in the set. For example, > "[^a]\\|$" will almost never match the $ (end-of-line) branch, because > a newline will be matched by the first branch. The only exception is > at the very end of the buffer if it is not newline-terminated, but > that is rarely worth considering for source code. > 5. \r (carriage return) normally doesn't appear in buffers even if the > file uses DOS line endings. Line endings are converted into a single > \n (newline) when the buffer is read. In particular, $ does NOT match > at \r, only before \n. > When \r appears it is usually because the file contains a mixture of > line-ending styles, typically from being edited using broken tools. > Whether you want to take such files into account is a matter of > judgement; most modes don't bother. > 6. Capturing groups costs more than non-capturing groups, but you > already know that. > On to specifics: here are annotations for possible improvements in > cc-langs.el. (I didn't bother about capturing groups here.) 
I think we should get around to fixing the regexps in CC Mode soon. But I think I would rather do this as a separate exercise, since the patch for this bug is already around 800 lines and Ravine Var, the OP, has found further problems on a slowish machine. In particular, some of the fixes in your patch relate to the CPP constructs, and they might well be slowing down that regexp in c-find-decl-spots I highlighted earlier. So I'm keen to look at this again, once the current bug is settled. -- Alan Mackenzie (Nuremberg, Germany).
(Thu, 10 Dec 2020 17:13:02 GMT) Full text and rfc822 format available.Message #125 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Ravine Var <ravine.var <at> gmail.com> To: Alan Mackenzie <acm <at> muc.de> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 10 Dec 2020 22:32:17 +0530
> Thank you indeed for taking the trouble to test the patch on the lesser > machine. I do not have access to such a machine. I am assuming that > before this patch, such a large file like osprey_reg....h would have > been completely unworkable on the machine. It sounds as though it still > is. However, have you noticed any improvement at all in performance? There is a marginal improvement - recovery from scroll lockups are slightly faster. But, in general, working with the osprey header file is still very painful. > Could I ask you please to do one more thing, and that is to take a > profile on this machine where it is giving trouble. From a freshly > loaded buffer, move forward (if necessary) to a troublesome spot. N.B. > C-u 1 M-> moves to 10% away from the end of the buffer, C-u 2 M-> 20%, > and so on. Then start the profiler and do what is causing sluggish > performance. Then have a look at the final profiler output, and expand > it sensibly so that the troublesome function can be found. > > (Optional paragraph.) How to use the profiler: Do M-x profiler-start > RET, and accept the default mode with another RET. Perform the stuff to > be profiled. Do M-x profiler-report, which gives three or four lines of > output, each with a number and a percentage. Move point to a line with > a large percentage and type RET to expand it. You can repeat this to > expand further. Please expand the lines down to where the percentages > remaining are around 5% or 6%. There will be quite a lot of lines near > the start showing the same large percentage. I opened the osprey file and started scrolling down and the screen locked up. Here is the profile report (with emacs -Q): https://gist.github.com/ravine-var/0c293968a902cde76af77f2872dde1d7 I am using emacs master (along with your patch) built with LTO enabled and CFLAGS set to '-O2 -march=native'.
(Thu, 10 Dec 2020 20:03:01 GMT) Full text and rfc822 format available.Message #128 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Thu, 10 Dec 2020 20:02:44 +0000
Hello, Ravine. On Thu, Dec 10, 2020 at 22:32:17 +0530, Ravine Var wrote: > > Thank you indeed for taking the trouble to test the patch on the lesser > > machine. I do not have access to such a machine. I am assuming that > > before this patch, such a large file like osprey_reg....h would have > > been completely unworkable on the machine. It sounds as though it still > > is. However, have you noticed any improvement at all in performance? > There is a marginal improvement - recovery from scroll lockups are > slightly faster. But, in general, working with the osprey header > file is still very painful. OK, I still have some work to do, here. > > Could I ask you please to do one more thing, and that is to take a > > profile on this machine where it is giving trouble. From a freshly > > loaded buffer, move forward (if necessary) to a troublesome spot. N.B. > > C-u 1 M-> moves to 10% away from the end of the buffer, C-u 2 M-> 20%, > > and so on. Then start the profiler and do what is causing sluggish > > performance. Then have a look at the final profiler output, and expand > > it sensibly so that the troublesome function can be found. [ .... ] > I opened the osprey file and started scrolling down and the screen > locked up. Here is the profile report (with emacs -Q): > https://gist.github.com/ravine-var/0c293968a902cde76af77f2872dde1d7 Thanks. That was very helpful. I've still got to analyse it more deeply, but one thing that stood out (to me, at least), was c-forward-name taking up 13% of the run time in your profile. If we include the garbage collection this will have caused, it might be as high as 20% of the time, and that's right at the beginning of your buffer. To fix this, can I ask you, please, to try adding the following patch to your already patched software, and let me know if it helps at all. 
If it does, that's great, if not, could I ask you to do another profile for me on the less powerful machine, say by opening the buffer, starting the profiler, then moving to the middle of the buffer with C-u 5 M->. This may take some time to profile. Thanks! > I am using emacs master (along with your patch) built with LTO enabled > and CFLAGS set to '-O2 -march=native'. That's the ideal testing setup. Here's that patch: diff -r 863d08a1858a cc-engine.el --- a/cc-engine.el Thu Nov 26 11:27:52 2020 +0000 +++ b/cc-engine.el Tue Dec 08 19:48:50 2020 +0000 @@ -8276,7 +8325,8 @@ ;; typically called from `c-forward-type' in this case, and ;; the caller only wants the top level type that it finds to ;; be promoted. - c-promote-possible-types) + c-promote-possible-types + (lim+ (c-determine-+ve-limit 500))) (while (and (looking-at c-identifier-key) @@ -8306,7 +8359,7 @@ ;; Handle a C++ operator or template identifier. (goto-char id-end) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (cond ((eq (char-before id-end) ?e) ;; Got "... ::template". (let ((subres (c-forward-name))) @@ -8336,13 +8389,13 @@ (looking-at "::") (progn (goto-char (match-end 0)) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (eq (char-after) ?*)) (progn (forward-char) t)))) (while (progn - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (setq pos (point)) (looking-at c-opt-type-modifier-key)) (goto-char (match-end 1)))))) @@ -8352,7 +8405,7 @@ (setq c-last-identifier-range (cons (point) (match-end 0))) (goto-char (match-end 0)) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (setq pos (point) res 'operator))) @@ -8366,7 +8419,7 @@ (setq c-last-identifier-range (cons id-start id-end))) (goto-char id-end) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (setq pos (point) res t))) @@ -8382,7 +8435,7 @@ ;; cases with tricky syntactic whitespace that aren't ;; covered in `c-identifier-key'. 
(goto-char (match-end 0)) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) t) ((and c-recognize-<>-arglists @@ -8391,7 +8444,7 @@ (when (let (c-last-identifier-range) (c-forward-<>-arglist nil)) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) (unless (eq (char-after) ?\() (setq c-last-identifier-range nil) (c-add-type start (1+ pos))) @@ -8406,7 +8459,7 @@ (when (and c-record-type-identifiers id-start) (c-record-ref-id (cons id-start id-end))) (forward-char 2) - (c-forward-syntactic-ws) + (c-forward-syntactic-ws lim+) t) (when (and c-record-type-identifiers id-start -- Alan Mackenzie (Nuremberg, Germany).
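The idea behind the patch above, giving each forward whitespace scan an upper bound, can be illustrated without CC Mode. This is a minimal sketch under stated assumptions: `my-skip-ws-bounded' is a hypothetical helper, and the bound is treated simply as a character count. The exact semantics of `c-determine-+ve-limit' are not shown in this thread and are not reproduced here.

```elisp
(defun my-skip-ws-bounded (max-chars)
  "Skip whitespace forward from point, but at most MAX-CHARS characters.
Return the new value of point.  Hypothetical helper, for illustration."
  (let ((limit (min (point-max) (+ (point) max-chars))))
    ;; The second argument stops the scan at LIMIT even if more
    ;; whitespace follows.
    (skip-chars-forward " \t\n" limit)
    (point)))

;; A buffer with 10000 spaces before the first token: the bounded
;; scan gives up after 500 characters instead of walking all of them.
(with-temp-buffer
  (insert (make-string 10000 ?\s) "x")
  (goto-char (point-min))
  (my-skip-ws-bounded 500))   ; 501
```

The trade-off is the usual one for such limits: a scan that stops early may miss a token that lies beyond the bound, in exchange for a predictable worst-case cost.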
(Fri, 11 Dec 2020 11:05:01 GMT) Full text and rfc822 format available.Message #131 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Ravine Var <ravine.var <at> gmail.com> To: Alan Mackenzie <acm <at> muc.de> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Fri, 11 Dec 2020 16:25:20 +0530
Alan Mackenzie <acm <at> muc.de> writes: > To fix this, can I ask you, please, to try adding the following patch to > your already patched software, and let me know if it helps at all. If > it does, that's great, if not, could I ask you to do another profile for > me on the less powerful machine, say by opening the buffer, starting the > profiler, then moving to the middle of the buffer with C-u 5 M->. This > may take some time to profile. Thanks! Doing C-u 5 M-> just jumps to the middle immediately. The problem happens when the file is opened and I start scrolling with C-v. With the new patch, things are still bad - emacs freezes almost instantly. I tested with 3 patches applied from messages 35, 95 and 128. Here's the profile with emacs -Q : https://gist.github.com/ravine-var/48b3e1469ac5a7f3c3df8d6d9313661a
(Sat, 12 Dec 2020 15:35:02 GMT) Full text and rfc822 format available.Message #134 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Sat, 12 Dec 2020 15:34:33 +0000
Hello, Ravine. On Fri, Dec 11, 2020 at 16:25:20 +0530, Ravine Var wrote: > Alan Mackenzie <acm <at> muc.de> writes: > > To fix this, can I ask you, please, to try adding the following patch to > > your already patched software, and let me know if it helps at all. If > > it does, that's great, if not, could I ask you to do another profile for > > me on the less powerful machine, say by opening the buffer, starting the > > profiler, then moving to the middle of the buffer with C-u 5 M->. This > > may take some time to profile. Thanks! > Doing C-u 5 M-> just jumps to the middle immediately. The problem > happens when the file is opened and I start scrolling with C-v. > With the new patch, things are still bad - emacs freezes almost > instantly. I've had a good look at your latest profile result. There doesn't seem to be any further untoward looping of low-level functions. So I'm not sure what more to fix, other than.... Have you got the option fast-but-imprecise-scrolling set (or customized) to non-nil? If not, could I suggest you try it. Its effect is to stop Emacs fontifying every screen it scrolls over, instead only fontifying screens when it's got no more input commands waiting. This speeds things up quite a bit on a slower machine. > I tested with 3 patches applied from messages 35, 95 and 128. > Here's the profile with emacs -Q : > https://gist.github.com/ravine-var/48b3e1469ac5a7f3c3df8d6d9313661a Thanks! There appear to be about 8 seconds worth of profile data there. How many screenfuls, approximately, did you actually scroll over in that time? Or, rather than answering that question, could I get you to try another timing test? Please put the following code into your *scratch* buffer (it's the same code I've posted before) and evaluate it:

(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
     ,@forms
     (- (float-time) start)))

Then please load osprey_reg_map_macro.h freshly into a buffer, and type (or cut and paste) the following into M-:

(time-it (let ((n 10)) (while (> n 0) (scroll-up) (sit-for 0) (setq n (1- n)))))

What is the reported timing for scrolling these ten screens? Thanks! -- Alan Mackenzie (Nuremberg, Germany).
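For anyone wanting to try the option mentioned above, it can be enabled from an init file. This is a one-line sketch; whether to enable it globally like this is a matter of preference, and `fast-but-imprecise-scrolling' can equally be set through M-x customize-variable.

```elisp
;; Let Emacs skip fontifying screens it merely scrolls past while
;; further input is pending, as described above.
(setq fast-but-imprecise-scrolling t)
```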
(Mon, 14 Dec 2020 07:35:02 GMT) Full text and rfc822 format available.Message #137 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Ravine Var <ravine.var <at> gmail.com> To: Alan Mackenzie <acm <at> muc.de> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 14 Dec 2020 12:50:36 +0530
Alan Mackenzie <acm <at> muc.de> writes: > Have you got the option fast-but-imprecise-scrolling set (or customized) > to non-nil? If not, could I suggest you try it. Its effect is to stop > Emacs fontifying every screen it scrolls over, instead only fontifying > screens when it's got no more input commands waiting. This speeds > things up quite a bit on a slower machine. Turning on fast-but-imprecise-scrolling improves things by a lot. Viewing and scrolling the osprey file is much faster/smoother and the screen doesn't freeze. > Please put the following code into your *scratch* buffer (it's the same > code I've posted before) and evaluate it: > > (defmacro time-it (&rest forms) > "Time the running of a sequence of forms using `float-time'. > Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"." > `(let ((start (float-time))) > ,@forms > (- (float-time) start))) > > Then please load osprey_reg_map_macro.h freshly into a buffer, and type > (or cut and paste) the following into M-: > > (time-it (let ((n 10)) (while (> n 0) (scroll-up) (sit-for 0) (setq n (1- n))))) > > What is the reported timing for scrolling these ten screens? Running emacs -Q (master + 3 patches) : With fast-but-imprecise-scrolling: 0.9250097274780273 Without fast-but-imprecise-scrolling: 0.8903303146362305 I think using the fast-but-imprecise-scrolling option is a workaround that can be used in underpowered machines for big header files...
(Mon, 14 Dec 2020 11:45:01 GMT) Full text and rfc822 format available.Message #140 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Mon, 14 Dec 2020 11:44:35 +0000
Hello, Ravine. On Mon, Dec 14, 2020 at 12:50:36 +0530, Ravine Var wrote: > Alan Mackenzie <acm <at> muc.de> writes: > > Have you got the option fast-but-imprecise-scrolling set (or customized) > > to non-nil? If not, could I suggest you try it. Its effect is to stop > > Emacs fontifying every screen it scrolls over, instead only fontifying > > screens when it's got no more input commands waiting. This speeds > > things up quite a bit on a slower machine. > Turning on fast-but-imprecise-scrolling improves things by a lot. > Viewing and scrolling the osprey file is much faster/smoother and the > screen doesn't freeze. :-) > > Please put the following code into your *scratch* buffer (it's the same > > code I've posted before) and evaluate it: > > (defmacro time-it (&rest forms) > > "Time the running of a sequence of forms using `float-time'. > > Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"." > > `(let ((start (float-time))) > > ,@forms > > (- (float-time) start))) > > Then please load osprey_reg_map_macro.h freshly into a buffer, and type > > (or cut and paste) the following into M-: > > (time-it (let ((n 10)) (while (> n 0) (scroll-up) (sit-for 0) (setq n (1- n))))) > > What is the reported timing for scrolling these ten screens? > Running emacs -Q (master + 3 patches) : > With fast-but-imprecise-scrolling: 0.9250097274780273 > Without fast-but-imprecise-scrolling: 0.8903303146362305 Thanks for doing that further testing. That's 0.09 seconds per scrolling of a screen. That is surely an acceptably low delay. > I think using the fast-but-imprecise-scrolling option > is a workaround that can be used in underpowered machines > for big header files... Or even in up to date full powered machines. ;-) I have it enabled all the time, and my PC is very similar to your faster one. So, I propose that these two patches (the big one and the smaller one for all the c-forward-syntactic-ws's) are sufficient to fix the bug, and I propose closing it now. 
What do you say to that? I have looked at the other problem you mention (slow scrolling through the machine-generated function proto_register_rrc in the wireshark file packet-rrc.c) and have made significant progress towards implementing a cache for the CC Mode function c-looking-at-or-maybe-in-bracelist, which should eliminate the long delays. Have you raised a new bug for this problem, yet? -- Alan Mackenzie (Nuremberg, Germany).
(Tue, 15 Dec 2020 04:06:02 GMT) Full text and rfc822 format available.Message #143 received at 25706 <at> debbugs.gnu.org (full text, mbox):
From: Ravine Var <ravine.var <at> gmail.com> To: Alan Mackenzie <acm <at> muc.de> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706 <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 15 Dec 2020 09:31:01 +0530
> So, I propose that these two patches (the big one and the smaller one for > all the c-forward-syntactic-ws's) are sufficient to fix the bug, and I > propose closing it now. What do you say to that? Works for me. Thanks for the patches. :-) > I have looked at the other problem you mention (slow scrolling through > the machine-generated function proto_register_rrc in the wireshark file > packet-rrc.c) and have made significant progress towards implementing a > cache for the CC Mode function c-looking-at-or-maybe-in-bracelist, which > should eliminate the long delays. Have you raised a new bug for this > problem, yet? https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45248
Alan Mackenzie <acm <at> muc.de>
:Sujith <m.sujith <at> gmail.com>
:Message #148 received at 25706-done <at> debbugs.gnu.org (full text, mbox):
From: Alan Mackenzie <acm <at> muc.de> To: Ravine Var <ravine.var <at> gmail.com> Cc: Mattias Engdegård <mattiase <at> acm.org>, Lars Ingebrigtsen <larsi <at> gnus.org>, 25706-done <at> debbugs.gnu.org Subject: Re: bug#25706: 26.0.50; Slow C file fontification Date: Tue, 15 Dec 2020 12:27:45 +0000
Hello, Ravine. On Tue, Dec 15, 2020 at 09:31:01 +0530, Ravine Var wrote: > > So, I propose that these two patches (the big one and the smaller > > one for all the c-forward-syntactic-ws's) are sufficient to fix the > > bug, and I propose closing it now. What do you say to that? > Works for me. Thanks for the patches. :-) Thank you for all the testing! I've committed the changes to everywhere relevant, and I'm closing the bug with this post. > > I have looked at the other problem you mention (slow scrolling > > through the machine-generated function proto_register_rrc in the > > wireshark file packet-rrc.c) and have made significant progress > > towards implementing a cache for the CC Mode function > > c-looking-at-or-maybe-in-bracelist, which should eliminate the long > > delays. Have you raised a new bug for this problem, yet? > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45248 Thank you for this new bug report. I'll carry on trying to fix it. -- Alan Mackenzie (Nuremberg, Germany).
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Wed, 13 Jan 2021 12:24:04 GMT) Full text and rfc822 format available.