r/excel

omarjibory

Extracting data from a PDF into an Excel spreadsheet

unsolved

Hello everyone,

I'm looking for help extracting data from a large PDF file into an Excel spreadsheet.

The data I have is unstructured. I watched a couple of YouTube videos and tried Power Query, but I ended up with 1000+ tables. I'm not sure what I did wrong.

Would appreciate any help.

PS. I'm dumb when it comes to coding.

ExoWire

•

It really depends on what your PDF looks like. Are there tables inside? What can you select when you import it into Power Query?

omarjibory

•

There are no tables in the data. It's basically a list of patients with certain diagnoses. Every time I use Power Query (Combine & Transform), I get groups of tables: one group per patient, with the MRN and ID in one table and the diagnosis in a separate one.

ExoWire

•

So are there groups of tables, or only pages? You use Excel on Windows, don't you? What do you combine and transform? Are these multiple PDFs, or one PDF containing all the data? Can you query the data before combining?

omarjibory

•

It's Excel on Windows.

I watched a YouTube video, and this is what I did:

I went to Data > Get Data > From File > From Folder and then chose Combine & Transform. This is where I get so many tables. I took a screenshot to show it.

ExoWire

•

If the tables are not suitable for you, there should be pages at the bottom. Choose Parameter1 and in the next step filter out all tables and use the pages to continue.
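To make the logic of that filter step concrete, here is a minimal sketch in Python rather than in Power Query itself. It assumes each entry Power Query lists for a PDF carries a `Kind` value of either `Table` or `Page`; the sample rows and the `keep_pages` helper are made up for illustration.

```python
# Hypothetical rows mimicking the list Power Query shows for one PDF.
# The field names (Id, Name, Kind) are an assumption about that list.
ITEMS = [
    {"Id": "Table001", "Name": "Table001 (Page 1)", "Kind": "Table"},
    {"Id": "Table002", "Name": "Table002 (Page 1)", "Kind": "Table"},
    {"Id": "Page001", "Name": "Page001", "Kind": "Page"},
    {"Id": "Page002", "Name": "Page002", "Kind": "Page"},
]


def keep_pages(items):
    """Drop the auto-detected tables and keep only the whole-page entries."""
    return [item for item in items if item["Kind"] == "Page"]


pages = keep_pages(ITEMS)
print([item["Id"] for item in pages])
```

In Power Query the equivalent would be a filter on that column so that only the page rows remain before you continue.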

tripleM98

•

I had to find a way to extract data from multiple unstructured PDFs to Excel.

To do this, I used Excel VBA and referenced the appropriate library to communicate with the PDF editor.

Then I used regular expressions to match certain patterns in the PDF. After matching the string patterns, I stored the strings in an ArrayList object and transferred the data to Excel that way.

By the way, even though this involves coding, you might as well give it a shot if macros are allowed at your company. You might even learn something new and impress your colleagues if your program is stable and foolproof enough.
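The same regular-expression idea can be sketched in Python. This is only an illustration, not the VBA code described in this comment: the sample text, the `MRN:` / `ID:` / `Diagnosis:` labels, and the helper names are all assumptions, since the real report layout is redacted. It matches one record at a time, collects the matches in a list, and writes them out as CSV text that Excel can open.

```python
import csv
import io
import re

# Hypothetical text already extracted from a PDF. The labels and layout
# are assumptions; adjust the pattern to the real report format.
SAMPLE_TEXT = """\
MRN: 100234  ID: A-17
Diagnosis: Type 2 diabetes
MRN: 100871  ID: B-02
Diagnosis: Hypertension
"""

# One record: MRN and ID on one line, the diagnosis on the next line.
RECORD_PATTERN = re.compile(
    r"MRN:\s*(?P<mrn>\d+)\s+ID:\s*(?P<id>\S+)\s*\n"
    r"Diagnosis:\s*(?P<diagnosis>[^\n]+)"
)


def extract_records(text):
    """Return one dict per matched record, in document order."""
    return [
        {key: value.strip() for key, value in match.groupdict().items()}
        for match in RECORD_PATTERN.finditer(text)
    ]


def records_to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["mrn", "id", "diagnosis"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = extract_records(SAMPLE_TEXT)
print(records_to_csv(records), end="")
```

Saving the returned text to a `.csv` file gives something Excel opens as a normal sheet, one row per patient record.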

optimoapps

•

Try https://bankstmtconverter.com, where you can extract tables from PDFs. It can process 100 pages in a single PDF.

omarjibory

•

Thank you, I will try it.

Pineapple_Playful

•

If you know exactly what you want to extract, just take each page separately and use an automation tool for data extraction like this.

omarjibory

•

I'm exploring this option. Thanks for the tip.

Gabo-0704

•

Could you share part of that document so we can see whether there is a possible solution?

omarjibory

•

Absolutely,

It's redacted for privacy, but I want to collect all the elements listed in the reports.

You can see that one page might have only one report, while others have more than one.
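Because a page can hold one report or several, one possible approach is to split each page's text wherever a new report starts. The sketch below is in Python and assumes every report begins with a line starting with `MRN:`; that marker, the sample pages, and the `split_reports` helper are hypothetical, since the real layout is redacted.

```python
import re

# Hypothetical page texts; the "MRN:" boundary line is an assumption.
PAGES = [
    "MRN: 100234\nDiagnosis: Type 2 diabetes\n",
    "MRN: 100871\nDiagnosis: Hypertension\nMRN: 100990\nDiagnosis: Asthma\n",
]

# A report starts at any line that begins with "MRN:".
REPORT_START = re.compile(r"^MRN:", re.MULTILINE)


def split_reports(page_text):
    """Split one page's text into report strings.

    Any text before the first "MRN:" line is ignored.
    """
    starts = [match.start() for match in REPORT_START.finditer(page_text)]
    ends = starts[1:] + [len(page_text)]
    return [page_text[start:end].strip() for start, end in zip(starts, ends)]


all_reports = [report for page in PAGES for report in split_reports(page)]
for number, report in enumerate(all_reports, start=1):
    print(number, report.splitlines())
```

Each item in `all_reports` is then a single report, no matter how many shared a page, and can be parsed field by field.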

Gabo-0704

•

• Edited

Opening the file in Power Query does not seem to present problems with the structure of your document. Are you using multiple PDF files for each report? Or how exactly did you open the file?

omarjibory

•

I watched a YouTube video, and this is what I did:

I went to Data > Get Data > From File > From Folder and then chose Combine & Transform. This is where I get so many tables. I took a screenshot to show it.

small_trunks

•

I wrote this for inspecting PDF files using Power Query: https://www.dropbox.com/scl/fi/isec3htdw4206ck2naeje/PDFinfoV2.xlsx?rlkey=7bqcrmssm0a6bprwnh9x320jp&dl=1
