tracker issue : CF-3041780

Title:

Bug 83599: I'm not sure if this is a regression bug with more log info, or a new one that looks like an old one


Status/Resolution/Reason: Closed/Fixed/

Reporter/Name(from Bugbase): Tomas Fjetland / Tomas Fjetland (TomasFjetland)

Created: 07/15/2010

Components: Text Search, Solr

Versions: 9.0.1

Failure Type: Unspecified

Found In Build/Fixed In Build: 274733 / 275414

Priority/Frequency: Major / Unknown

Locale/System: English / Win All

Vote Count: 0

Problem:

I'm not sure whether this is a regression of an earlier bug, now with more log information, or a new bug that resembles an old one. I'm indexing a collection of 52 000 text files, each smaller than 10 KB. Since installing the 9.0.1 update, indexing never finishes; it dies with the error quoted under Result. The file referred to in the error is a simple indexing job that is called from Scheduled Tasks. It submits about 23 000 files to the collection before dying.
Method:

Set up a collection "Fileinfo". Run an index job against a UNC path, like

<cfindex collection="Fileinfo" action="refresh" extensions=".nfo" key="\\nas\researchdata\metadata" type="path" urlpath="\\nas\researchdata\metadata" recurse="yes" language="english" status="metastat">

where there are 50 000+ small text files with metadata, using a .nfo extension. It stops halfway with the error listed. Along the way it might also log errors for individual files in server.log. The logging is new, but it might not have been able to index these files before either:

"Warning","jrpp-26","07/15/10","16:39:53",,"WARNING: Could not index \\nas\researchdata\metadata\project.591\early.drafts\index.overview.nfo in SOLR. Check the exception for more details: Text encoding could not be detected and no encoding hint is available in document metadata"
Result:

"Error","jrpp-26","07/15/10","16:39:53",,"org/ccil/cowan/tagsoup/Parser The specific sequence of files included or processed is: C:\Inetpub\wwwroot\Verity\indexer_weekly.cfm, line: 28 "
java.lang.NoClassDefFoundError: org/ccil/cowan/tagsoup/Parser
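
A NoClassDefFoundError like the one above generally means the JVM could not load the named class at run time. As a rough diagnostic sketch (not something taken from this report), the check below tests whether a class name is loadable from the current class loader. The class name ClasspathCheck and the method isLoadable are hypothetical, and the result only describes the JVM the code runs in, so it would need to execute inside the ColdFusion server's JVM to say anything about this failure.

```java
public class ClasspathCheck {
    // Returns true if the named class can be located and loaded by the
    // class loader that loaded this class; false otherwise.
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving dependencies
            return false;
        }
    }

    public static void main(String[] args) {
        // Binary name of the class from the error message (slashes become dots)
        String name = "org.ccil.cowan.tagsoup.Parser";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If the check reports the class as not loadable inside the server's JVM, one plausible (unconfirmed) explanation is that the jar providing it is missing from the classpath used by the indexing code.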

----------------------------- Additional Watson Details -----------------------------

Watson Bug ID:	3041780

External Customer Info:
External Company:  
External Customer Name: Tomas Fjetland
External Customer Email: 599872CD4866EA489920154A
External Test Config: 07/15/2010

Attachments:

Comments: