...
Java Wiktionary Library (JWKTL) is an open source project that aims to ease the parsing of Wiktionary data. See the project site: https://dkpro.github.io/dkpro-jwktl/
By default, the JWKTL source available from their site supports only English, German and Russian.
At Accenture, we added Spanish support; the source code is available from Git: https://source.digital.accenture.com/projects/ST/repos/saga-jwktl/browse
For Spanish, we focused on the bare minimum SAGA needs to work. If you want to do the same, it may be a good idea to base your new language on the Spanish parser (copy, paste and rename files). If you want to implement a more complete version of the parser, the English parser is a better starting point.
The following image shows the structure of the JWKTL project. Notice that there is a folder for each language. You'll need to add a new folder for the language you want to support.
Info |
---|
Handlers are registered in the WiktionaryEntryParser for each language. The registration order matters: for example, SenseHandler must be registered last. For the handlers you implement, we recommend following the same order the English parser uses. |
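
To illustrate why registration order can matter, here is a minimal, self-contained sketch. It is not JWKTL code: the `Handler` and `Registry` classes, the matching rules, and the assumption that the first registered handler accepting a line wins are all hypothetical, chosen only to show how a permissive handler registered too early can shadow the more specific ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class HandlerOrderSketch {

    /** Hypothetical stand-in for a block handler; not the actual JWKTL type. */
    static final class Handler {
        final String name;
        final Predicate<String> canHandle;

        Handler(String name, Predicate<String> canHandle) {
            this.name = name;
            this.canHandle = canHandle;
        }
    }

    /** Keeps handlers in registration order; the first one that accepts a line wins (assumed). */
    static final class Registry {
        private final List<Handler> handlers = new ArrayList<>();

        Registry register(Handler handler) {
            handlers.add(handler);
            return this;
        }

        String select(String line) {
            for (Handler handler : handlers) {
                if (handler.canHandle.test(line)) {
                    return handler.name;
                }
            }
            return "none";
        }
    }

    // Specific handler: accepts only a pronunciation heading (made-up rule).
    static Handler pronunciation() {
        return new Handler("pronunciation", line -> line.contains("Pronunciation"));
    }

    // Permissive handler standing in for a sense handler: accepts any non-empty line.
    static Handler sense() {
        return new Handler("sense", line -> !line.isEmpty());
    }

    public static void main(String[] args) {
        Registry senseLast = new Registry().register(pronunciation()).register(sense());
        Registry senseFirst = new Registry().register(sense()).register(pronunciation());

        String heading = "===Pronunciation===";
        System.out.println("sense last:  " + senseLast.select(heading));
        System.out.println("sense first: " + senseFirst.select(heading));
    }
}
```

With the permissive handler registered last, the heading reaches the pronunciation handler; registered first, it takes the heading for itself. Check the English parser in the JWKTL source for the real handlers and their order.
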
In addition to adding a new folder and handlers for your new language, you need to make the following changes:
2. Instantiate your new language parser in the onSiteInfoComplete method of the class: src/main/java/de/tudarmstadt/ukp/jwktl/parser/WiktionaryArticleParser.java
Then, using another language as a model, add the supporting classes accordingly.
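
As a rough picture of what this step amounts to, the sketch below shows a per-language dispatch with one extra branch for the new language. It is an illustration only: the class names, the use of language codes, and the choice of Italian as the new language are all hypothetical, and the real onSiteInfoComplete method may be structured differently, so read the existing code before editing it.

```java
public class LanguageDispatchSketch {

    /** Hypothetical stand-in for a language-specific entry parser. */
    interface EntryParser {
        String describe();
    }

    static final class EnglishEntryParser implements EntryParser {
        public String describe() { return "English entry parser"; }
    }

    static final class SpanishEntryParser implements EntryParser {
        public String describe() { return "Spanish entry parser"; }
    }

    /** The newly added language; Italian is used purely as an example. */
    static final class ItalianEntryParser implements EntryParser {
        public String describe() { return "Italian entry parser"; }
    }

    /** Picks the parser for a dump's language; unsupported languages fail loudly. */
    static EntryParser parserFor(String languageCode) {
        switch (languageCode) {
            case "en":
                return new EnglishEntryParser();
            case "es":
                return new SpanishEntryParser();
            case "it":
                // The new branch: without it, the new language is never parsed.
                return new ItalianEntryParser();
            default:
                throw new IllegalArgumentException("Unsupported language: " + languageCode);
        }
    }

    public static void main(String[] args) {
        System.out.println(parserFor("it").describe());
    }
}
```
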
...