Description of problem:
Indexes generated using the latest cauldron x86_64 build of swish-e are likely to contain corrupt data. Searching these indexes is prone to segfaults or failures, whether using swish-e directly or through perl-SWISH-API.

Original upstream bug report:
http://swish-e.org/archive/2013-10/13144.html

More informative upstream bug report:
http://swish-e.org/archive/2013-11/13148.html

Version-Release number of selected component (if applicable):
2.4.7-11.mga4.x86_64

How reproducible:
Almost always, but results vary. The search either segfaults or returns an error.

Steps to Reproduce:
1. Create an index of a number of large documents. I chose random gtk-doc HTML document folders:
   swish-e -f ~/index.test.swish-e -i /usr/share/gtk-doc/html/gobject
2. Search for a common keyword to get lots of hits:
   swish-e -f ~/index.test.swish-e -w "gobject"

I also tried cairo and glib, and the results were either an immediate segfault or a truncated list of matching files, ending with an error similar to:

err: Failed to seek to properties located at 2738810656504414208 for file number 230 : Invalid argument
Created attachment 4577
Proposed patch

There hasn't been a new release in 4 years, but the project is still in active, if slight, development and I make good use of it. Rather than sulk, I went bug hunting. I found fixes for 2 other minor bugs and the solution to this one.

At first sight, this looked like a fix:
http://dev.swish-e.org/ticket/14

This failure is caused by using memcpy on overlapping memory areas in remove_worddata_longs. memcpy has undefined behaviour when source and destination overlap, while memmove is defined for that case. The call has been there for years and has only now started to fail. Changing memcpy to memmove fixed it.

I've attached a patch that fixes both of these bugs and silences this annoying warning:

err: Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/lib/swish-e/swishspider line 97

I've tested the patch here and all is good again :)
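For anyone who wants to see the memcpy/memmove difference in isolation, here is a minimal sketch. It is not the swish-e code: close_gap and its parameters are hypothetical names made up for illustration, and it only assumes that the buggy routine slides data left inside a single buffer, which is the pattern the overlap description suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT taken from swish-e: remove `gap` bytes that
 * start at offset `pos` in a buffer holding `len` bytes by sliding the
 * tail of the buffer left. Returns the new length. */
size_t close_gap(unsigned char *buf, size_t len, size_t pos, size_t gap)
{
    /* Caller must ask for a range that lies inside the buffer. */
    assert(pos <= len && gap <= len - pos);

    size_t tail = len - pos - gap;  /* bytes that follow the removed range */

    /* Source (buf + pos + gap) and destination (buf + pos) overlap
     * whenever tail > gap. memcpy is undefined for overlapping regions;
     * memmove is specified to copy correctly in that case. */
    memmove(buf + pos, buf + pos + gap, tail);

    return len - gap;
}
```

With memcpy in place of memmove the same call may appear to work on one libc or CPU and silently corrupt the tail on another, which would fit a bug that sat unnoticed for years.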
Keywords: (none) => PATCH, Triaged, UPSTREAM
CC: (none) => jani.valimaa
Assignee: bugsquad => thomas
Thanks a lot for the error report, and even more thanks for the patch. Would you please test it? (This patch makes it build again, passing the test.)
Status: NEW => ASSIGNED
No more error reports and it builds now. I consider it fixed.
Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED