Week 3
Milestones
Finish and wrap up the POC for editing the translation dictionary
- Search feature
- Investigate the translation output more thoroughly and find out why the BLEU score is so low
- Work on prompting and find a better method
- Fix the bug with the "Add" button when editing an existing row
- Box for displaying sentences, with the option to include or exclude the tests
- Allow manual entry of test translations into the benchmark
- No need to change the schema: "hack" a new proxy word that stores all the manually added test translations
- Add comments
- Document setup and instructions for use
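One possible starting point for the BLEU investigation item above: sentence-level BLEU without smoothing drops to zero as soon as any n-gram order has no match, which happens easily on short sentences. The sketch below is a simplified, self-contained BLEU (single reference, whitespace tokenization, optional add-one smoothing), not the scorer used in the benchmark. The function names and example sentences are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4, smooth=False):
    """Simplified single-reference BLEU (illustrative, not the benchmark's scorer)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if smooth:
            # Simplified add-one smoothing applied to every order.
            overlap, total = overlap + 1, total + 1
        if overlap == 0 or total == 0:
            # One empty order zeroes the geometric mean.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat is on the mat"
hypothesis = "the cat sat on the mat"  # differs from the reference by one word
print(sentence_bleu(reference, hypothesis))               # no 4-gram matches
print(sentence_bleu(reference, hypothesis, smooth=True))  # smoothed variant
```

If the benchmark sentences are short, comparing unsmoothed and smoothed scores (or switching to a corpus-level score) may show whether the low BLEU reflects the metric setup rather than the translations themselves.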
PDF Parsing tasks
- Create a ticket that fully defines the use case
- Plan out a small UI for defining text boundaries, and carry out OCR within each boundary. Also allow the user to define these boundaries.
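As a rough sketch of the boundary step above, a user-drawn rectangle can be normalized, clamped to the page, and used to crop the region that would then be passed to OCR. Everything here is hypothetical: the `Box` type, the function names, and the page represented as a nested list of pixel values are assumptions for illustration, and the OCR call itself is left out because the engine has not been chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical text boundary in pixel coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def clamp(value, low, high):
    """Restrict value to the range [low, high]."""
    return max(low, min(value, high))


def box_from_drag(x0, y0, x1, y1, page_width, page_height):
    """Build a Box from two drag corners given in any order, clamped to the page."""
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    return Box(
        left=clamp(left, 0, page_width),
        top=clamp(top, 0, page_height),
        right=clamp(right, 0, page_width),
        bottom=clamp(bottom, 0, page_height),
    )


def crop(pixels, box):
    """Return the sub-image inside box; pixels is a list of rows."""
    return [row[box.left:box.right] for row in pixels[box.top:box.bottom]]


# A tiny 4x3 "page" whose values encode row and column for easy checking.
page = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]
# The user drags from bottom-right to top-left and overshoots the page edge.
box = box_from_drag(5, 4, 1, 1, page_width=4, page_height=3)
region = crop(page, box)
print(box)
print(region)  # this cropped region is what an OCR engine would receive
```

Normalizing the corners first means the UI does not need to care about drag direction, and clamping keeps an overshooting selection from producing an out-of-range crop.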
Screenshots / Videos
- Created a GitHub repository to store the POC: https://github.com/shrivastava95/poc_ocr
- POC demo with basic OCR parsing