
    Table Recognition and Extraction

    by Tian Qing 02/28/2018 04:42 PM GMT


      Description

      The goal of this task is to extract tables from the Yearbook PDFs and structure them into a readable data set (CSV) for governments, researchers, and others to use in the future.

      Initial Phase: 

      1. Downloading the PDFs.

      2. Using packages in R / Python to extract tables from the PDFs.

      3. Transforming the extracted tables into data frames.
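
As a rough illustration of steps 2 and 3, here is a minimal Python sketch. It assumes the third-party `pdfplumber` package as the extraction library (the description does not name one), and the helper names `iter_pdf_tables`, `rows_to_records`, and `records_to_csv` are hypothetical. It also assumes the first row of each detected table is the header, which will not hold for every Yearbook table.

```python
import csv
import io
from typing import Dict, Iterator, List, Optional, Tuple

Row = List[Optional[str]]


def iter_pdf_tables(pdf_path: str) -> Iterator[Tuple[int, List[Row]]]:
    """Yield (page_number, rows) for every table detected in the PDF.

    Assumption: pdfplumber is installed. It is imported lazily so the
    pure-Python helpers below remain usable without it.
    """
    import pdfplumber  # third-party; an assumed choice of library

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for rows in page.extract_tables():
                yield page.page_number, rows


def rows_to_records(rows: List[Row]) -> List[Dict[str, str]]:
    """Treat the first row as the header and return one dict per data row."""
    if not rows:
        return []
    # Empty header cells get a positional placeholder name.
    header = [(cell or "").strip() or f"col_{i}" for i, cell in enumerate(rows[0])]
    records = []
    for row in rows[1:]:
        cells = [(cell or "").strip() for cell in row]
        # Pad or trim ragged rows so every record has every header key.
        cells = (cells + [""] * len(header))[: len(header)]
        records.append(dict(zip(header, cells)))
    return records


def records_to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize records to CSV text, using the first record's keys as columns."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

The record list plays the role of a data frame here. If pandas is available, `pandas.DataFrame(records)` would give the same structure with richer tooling. Note that duplicate header names collapse into a single key in this sketch, so real tables may need headers de-duplicated first.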

      Cleaning Phase:

      1. Removing or correcting rows and columns that contain incorrect values.

      2. Identifying each table by its page number and the name of its source PDF.

      3. Normalizing similar tables into the same structure to avoid redundancy if the data is stored in a UN database.
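
The cleaning steps above could look roughly like the following sketch. It uses only the Python standard library and assumes each table has already been turned into a list of dicts with string values. All function names, and the long-format column names (`item`, `variable`, `value`), are hypothetical choices for illustration, not a schema taken from the project.

```python
from typing import Dict, List


def drop_blank_rows(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Remove rows in which every cell is empty or whitespace."""
    return [r for r in records if any(v.strip() for v in r.values())]


def tag_source(records: List[dict], pdf_name: str, page_no: int) -> List[dict]:
    """Attach the source PDF name and page number to every row."""
    return [{"source_pdf": pdf_name, "page_no": page_no, **r} for r in records]


def to_long_format(records: List[dict], key_column: str) -> List[dict]:
    """Reshape wide tables into one (item, variable, value) row per cell.

    Tables with different column sets then share a single structure,
    which is one way to avoid redundant schemas in a database.
    """
    id_columns = ("source_pdf", "page_no", key_column)
    long_rows = []
    for r in records:
        for column, value in r.items():
            # Skip identifier columns and empty cells.
            if column in id_columns or value == "":
                continue
            long_rows.append(
                {
                    "source_pdf": r["source_pdf"],
                    "page_no": r["page_no"],
                    "item": r[key_column],
                    "variable": column,
                    "value": value,
                }
            )
    return long_rows
```

A typical call order would be `drop_blank_rows`, then `tag_source`, then `to_long_format`. Detecting and fixing incorrect values (step 1) depends on the content of each table and is not covered beyond dropping blank rows.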

       

      Co-authors to your solution

      Gokulakrishnan Narasimhan, Guilherme Silveira, Meijie Li, Guangyue Li


      Initial idea, #StatsHistory

        Task: Approval. Due date: 06/15/2018. Status: Completed on 05/04/2018.