Talend Exchange is the place where the Talend community can share items related to Talend open source products, such as Data Integration, Data Quality and Master Data Management. Contribution is open to any user; no specific validation is needed. As soon as you have a forum account, you automatically get a Talend Exchange account.
About: tTikaExtractor uses the Apache Tika parser to easily extract information from many different formats (HTML, PDF, DOC, ODT, image, audio, video, ...). See http://tika.apache.org/1.0/formats.html for more information about the available parsers.
Revision 0.1, 249 downloads, released on 2012-01-25
Compatible with: 5.4.0, 5.3.0, 5.2.3, 5.0.0
Very useful indeed.
But the Apache Tika project is now at version 1.5, while the current tTikaExtractor still uses 1.0.
I upgraded it manually...