Conversation
matthiasblaesing
commented
Feb 14, 2026
- Additional package lucene-analysis-common required (KeywordAnalyzer, WhitespaceAnalyzer, PerFieldAnalyzerWrapper, CharTokenizer, LimitTokenCountAnalyzer)
- FieldTypes were introduced to carry the field behavior (tokenization state, indexing options, storage settings)
- BooleanQuery construction moved to builder
- Collector interface was modified
- TermEnum was replaced by TermsEnum
- Terms are stored per field, so queries have to be specified with the target field
- QuerySelectors were replaced by String sets
- RAMDirectory was removed and is replaced by ByteBuffersDirectory
- Lock handling was reworked
44d27c4 to
0c1baaa
Compare
|
@matthiasblaesing This looks great! Thank you! As far as I've tested, this generally works; however, there is a bug in the |
|
Also is it possible to upgrade the base JDK version to Java 21 for |
I had an initial branch that did that (and bumped to Lucene 10), but that would cause issues in the enterprise cluster, which runs its tests with JDK 17. @mbien gave the hint that Lucene 9 allows staying on the older Java level.
I'm seeing issues with LuceneIndex. There is a strange construction in |
|
I think I can solve the issue of Java 21 on Enterprise Tests. From what I see, the majority of the failures come from the Micronaut integration. I had some clash with that lately. I also see that my PR broke your paperwork, as the sig file needs to be re-generated for |
|
The micronaut problems are real. Real as in, I manually tested the case (started IDE with example project) and found completion broken. I'll look a bit further. |
|
@matthiasblaesing please rebase on master, @mbien fixed Micronaut tests with Java 21. |
b9cb304 to
4388561
Compare
|
The commits are rebased, but that does not seem to be a problem. I can reproduce the micronaut problem locally. The problem is not deterministic. I tested it by:
I noticed changes in the packages that were not listed anymore. I then located the index folders for the jars, and indeed the indices were at least partially broken. That matches the output of the unit tests reporting: |
|
I found at least one problem with the updated code: the locks move from per-factory to global and that explains the seemingly random behavior. |
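To illustrate why the scope of the lock table matters, here is a minimal sketch in plain Java. It does not use the Lucene `LockFactory` API or the code from this PR; `GlobalLockFactory`, `PerFactoryLockFactory`, and `tryObtain` are hypothetical names chosen for the illustration. With a shared static table, two unrelated indices that both use the same lock name block each other, which can look random depending on which index is touched first.

```java
import java.util.HashSet;
import java.util.Set;

public class LockScopeSketch {

    /** Hypothetical factory: the lock table is static, so ALL instances share it. */
    static final class GlobalLockFactory {
        private static final Set<String> HELD = new HashSet<>();

        boolean tryObtain(String name) {
            synchronized (HELD) {
                // add() returns false if some other factory already holds this name
                return HELD.add(name);
            }
        }
    }

    /** Hypothetical factory: each instance (one per index) has its own lock table. */
    static final class PerFactoryLockFactory {
        private final Set<String> held = new HashSet<>();

        synchronized boolean tryObtain(String name) {
            return held.add(name);
        }
    }

    public static void main(String[] args) {
        GlobalLockFactory g1 = new GlobalLockFactory();
        GlobalLockFactory g2 = new GlobalLockFactory();
        System.out.println("global, index 1: " + g1.tryObtain("write.lock"));
        System.out.println("global, index 2: " + g2.tryObtain("write.lock"));

        PerFactoryLockFactory p1 = new PerFactoryLockFactory();
        PerFactoryLockFactory p2 = new PerFactoryLockFactory();
        System.out.println("per-factory, index 1: " + p1.tryObtain("write.lock"));
        System.out.println("per-factory, index 2: " + p2.tryObtain("write.lock"));
    }
}
```

In the global variant the second index fails to obtain its lock although nothing is writing to it; in the per-factory variant both succeed.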
|
And I found the second round of problems:
|
20f6769 to
9a2e09d
Compare
|
I've done some testing today. The old
Another, probably biased (not measured in any way) impression: indexing seems to be a bit slower than before, though index search seems to be faster. I never had a completion that did not feel instant. Will test this on a Raspberry Pi. |
lkishalmi
left a comment
I'm on the edge about getting this merged. Even with known issues, merging it gives more people the chance to test, give feedback, and debug.
9a2e09d to
b9d392c
Compare
Yeah. I noticed that when I switch between 10 and 9 for testing. I'll see if I can come up with a method to allow a smoother up/downgrade (most probably this will mean: nuke the existing index).
Found the issue. Somewhere between multiple iterations I moved to a try-with-resources setup, and this was a leftover change that had not been reverted. The readers in MemoryIndex and LuceneIndex are shared and must not be closed at the use sites. Should be fixed now. |
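The failure mode can be reproduced in a few lines of plain Java. This is only a sketch under assumptions: `SharedReader` and `Index` are hypothetical stand-ins, not the actual `MemoryIndex`/`LuceneIndex` classes or a Lucene `IndexReader`. The point is that try-with-resources closes whatever it is handed, even when the object is owned by someone else.

```java
public class SharedReaderSketch {

    /** Hypothetical stand-in for a reader that is owned by the index and shared by all queries. */
    static final class SharedReader implements AutoCloseable {
        private boolean open = true;

        String read() {
            if (!open) {
                throw new IllegalStateException("reader already closed");
            }
            return "doc";
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Hypothetical index that hands out the same reader instance to every caller. */
    static final class Index {
        private final SharedReader reader = new SharedReader();

        SharedReader getReader() {
            return reader;
        }

        /** Problematic: try-with-resources closes the shared reader after the first query. */
        String queryClosing() {
            try (SharedReader r = getReader()) {
                return r.read();
            }
        }

        /** Use-site only borrows the reader; the owner decides when to close it. */
        String queryBorrowing() {
            SharedReader r = getReader();
            return r.read();
        }
    }

    public static void main(String[] args) {
        Index ok = new Index();
        System.out.println(ok.queryBorrowing());
        System.out.println(ok.queryBorrowing());

        Index broken = new Index();
        System.out.println(broken.queryClosing());
        try {
            broken.queryClosing();
        } catch (IllegalStateException e) {
            System.out.println("second query failed: " + e.getMessage());
        }
    }
}
```

The first `queryClosing()` call succeeds, which is why such a bug can go unnoticed until a later query hits the already closed reader.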
Updates look good to me and some parts are also similar to some of my update attempts which I never finished since something else always seemed to show up.
I ran some smoke testing scripts which mostly test initial index creation and everything worked well. The index size and indexing times were fairly similar to NB 29, which I somewhat expected since I seem to remember that the bottleneck wasn't the Lucene end last time I looked (edit: index creation is also only a small piece of the puzzle, queries would be interesting to benchmark). Didn't see any exceptions and everything seemed to work after indexing - great job!
Yeah. I noticed that when I switch between 10 and 9 for testing. I'll see if I can come up with a method to allow a smoother up/downgrade (most probably this will mean: nuke the existing index).
Technically we don't have to worry about this too much since the NB upgrade process does not copy the index. There is also the option of adding the Lucene backward-codecs dependency - this was done for the Maven indexer, which does actually move the index if it's fresh enough.
This allows old indices to be used for a bit longer before they have to be reset. (This does not cover Lucene 3, but it is something we might consider when we update Lucene the next time.)
left one comment inline
ide/parsing.lucene/src/org/netbeans/modules/parsing/lucene/RecordOwnerLockFactory.java
Outdated
029628e to
d752c8f
Compare
|
Pushed an update:
For comparison see: |
|
Ran the smoke test script again (opens and indexes ~850 NB modules) and everything worked fine without exceptions or indexer related warnings. Editor features worked after that too. Seems to be solid! |
- Additional package lucene-analysis-common required (KeywordAnalyzer, WhitespaceAnalyzer, PerFieldAnalyzerWrapper, CharTokenizer, LimitTokenCountAnalyzer)
- FieldTypes were introduced to carry the field behavior (tokenization state, indexing options, storage settings)
- BooleanQuery construction moved to builder
- Collector interface was modified
- TermEnum was replaced by TermsEnum
- Terms are stored per field, so queries have to be specified with the target field
- QuerySelectors were replaced by String sets
- RAMDirectory was removed and is replaced by ByteBuffersDirectory
- Lock handling was reworked
- Remove StoppableConvertor from Lucene Support
- Work around potentially excessively large terms in index documents. (observed was a single `usage` entry for a JS document with about 40_000 characters)

Co-authored-by: Laszlo Kishalmi <laszlo.kishalmi@gmail.com>
Co-authored-by: Michael Bien <mbien42@gmail.com>
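The commit message does not say how the large-term workaround is implemented. As one possible approach (an assumption, not necessarily what this PR does), a value can be cut to a byte budget before it is indexed, since Lucene limits a single term to 32766 UTF-8 bytes (`IndexWriter.MAX_TERM_LENGTH`). The helper name `truncateToUtf8Bytes` is hypothetical.

```java
public class TermLengthSketch {

    /** Same value as Lucene's IndexWriter.MAX_TERM_LENGTH; repeated here to stay dependency-free. */
    static final int MAX_TERM_BYTES = 32766;

    /**
     * Hypothetical helper: returns the longest prefix of value whose UTF-8
     * encoding fits into maxBytes, without splitting a code point.
     */
    static String truncateToUtf8Bytes(String value, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < value.length()) {
            int cp = value.codePointAt(i);
            // UTF-8 length of this code point (unpaired surrogates are over-counted, which is safe)
            int len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + len > maxBytes) {
                break;
            }
            bytes += len;
            i += Character.charCount(cp);
        }
        return value.substring(0, i);
    }

    public static void main(String[] args) {
        String usage = "a".repeat(40_000);
        String safe = truncateToUtf8Bytes(usage, MAX_TERM_BYTES);
        System.out.println(safe.length());
    }
}
```

Truncation loses information, so whether it is acceptable depends on how the field is queried; skipping the oversized value entirely would be another option.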
d752c8f to
8690430
Compare