Sunday, February 21, 2016

Converting a Scanned PDF to a text file

After a little fiddling, I seem to be able to convert a PDF that is a scan of a book to text format. Here is the approach in Ubuntu.

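1. Put your .pdf in a folder and navigate to that folder on the command line.
2. Convert each page to a PNG image (pdftoppm comes with the poppler-utils package):
pdftoppm -png [filename].pdf [prefix]
3. Next, install gocr:
sudo apt-get install gocr
4. Finally, run gocr on every image (the quotes keep filenames with spaces from breaking the loop):
for i in *.png; do gocr -i "$i" -o "$i.txt"; done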

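You'll have a big list of .txt files, one per page.

Now you can concatenate all the files (matching *.png.txt keeps the combined file itself out of the glob if you run this again):

cat *.png.txt > [new_file].txt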
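The steps above can be rolled into one shell function. This is only a rough sketch: the function name scan_pdf_to_text and its three arguments are my own invention, and it assumes pdftoppm and gocr are already installed.

```shell
# Sketch: OCR a scanned PDF into one text file.
# scan_pdf_to_text is a hypothetical helper name, not part of any package.
# Arguments: <input.pdf> <image prefix> <output.txt>
scan_pdf_to_text() {
    local pdf="$1" prefix="$2" out="$3"
    local tool img

    # Bail out early if either tool is missing.
    for tool in pdftoppm gocr; do
        command -v "$tool" >/dev/null || { echo "missing: $tool" >&2; return 1; }
    done
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 1; }

    # One PNG per page, named <prefix>-<page number>.png
    pdftoppm -png "$pdf" "$prefix" || return 1

    # Start with an empty output file, then append each page's text in glob order.
    : > "$out"
    for img in "$prefix"-*.png; do
        [ -e "$img" ] || continue
        gocr -i "$img" -o "$img.txt" || return 1
        cat "$img.txt" >> "$out"
    done
}
```

Something like scan_pdf_to_text book.pdf page book.txt would then leave the per-page files (page-*.png and page-*.png.txt) next to the combined book.txt.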

I won't claim that the text files are pretty at all, but you can take them and start to massage them until you end up with a nice text file that you can then use with a speed reading app.
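As a very small first pass at that massaging, here is a sketch that only trims trailing whitespace and squeezes runs of blank lines down to one. The name clean_ocr_text is made up for illustration, and real OCR output will need a lot more than this.

```shell
# Sketch: minimal cleanup of OCR output.
# clean_ocr_text is a hypothetical helper; it prints the cleaned text to stdout.
clean_ocr_text() {
    # sed: strip trailing spaces and tabs from every line.
    # awk: print non-blank lines, and only the first line of each blank run.
    sed 's/[[:space:]]*$//' "$1" |
        awk 'NF { blank = 0; print; next } !blank++ { print }'
}
```

You could run it as clean_ocr_text book.txt > book_clean.txt and build from there.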

If you build something that cleans these up, share it below.

Alternatively, if the PDF is not an image but a real text PDF, a simple pdftotext command should work.
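If you aren't sure which kind of PDF you have, one rough way to check is to run pdftotext and see whether anything readable comes out. This is a sketch under that assumption; the function name extract_pdf_text and the "letters or digits means it worked" test are my own guesses, not a guaranteed rule.

```shell
# Sketch: try pdftotext first and report whether the PDF had a text layer.
# extract_pdf_text is a hypothetical helper. Arguments: <input.pdf> <output.txt>
extract_pdf_text() {
    local pdf="$1" out="$2"
    pdftotext "$pdf" "$out" || return 1

    # Assumption: a pure scan yields output with no letters or digits in it.
    if grep -q '[[:alnum:]]' "$out"; then
        echo "text layer found, wrote $out"
    else
        echo "no text found in $pdf, it is probably a scan and needs OCR" >&2
        return 2
    fi
}
```

A return code of 2 is the cue to fall back to the pdftoppm and gocr approach.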

Friday, February 19, 2016

Working with Bitbucket and Transferring Ownership for Class Coding Projects

I recently started using Bitbucket because of its handy free private repositories feature.

I started out with one repo for class and named it after the class. Now I need to create another project, but I can't reuse the same name, and it doesn't make a lot of sense to put all the files under the same repo (the repo won't get huge or anything, but it's an organization issue). I found out there is a solution called Transfer Ownership (or Transfer Repository).

Teams to the rescue.

I created a team for my class, then I created a new repo. But wait... that isn't how you should do it!

Create a Team (whatever name you want to give it).

NEXT create a Project (logically it would be your class's name). My example below uses a project called KBAI.

* Be sure to select Private.

Next you can take your existing repos and transfer ownership to that KBAI project (or whatever you call yours).

1. Go to your repo.
2. Hit Settings.
3. Then Transfer Repository.
4. Select the team you created above.
5. Now select the project.

Or you can create a new repo and just point it at the Project, so everything for your "class" falls under this project. Clones will be smaller, since each repo is cloned separately rather than all of them at once.

Now the repos are grouped all nice and stuff under the project.

Now if you click on Cognitive Cooking, this is the clone path:
git clone https://onaclov2000@bitbucket.org/onaclovtech/cognitivecooking.git

(That path is wrong for all intents and purposes, but let's pretend it's right; I'll post a followup about this.)

You can see that the git repo is associated with the team, while in Bitbucket's structure the repo is associated with a project. That gives you some organization, and you don't have to download one repo for your entire class or program or whatever; the repos are just grouped.

You'll want to make sure to update the origin remote in your existing clones, since it's probably still pointing to the old location (if you're transferring repositories).
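For an existing clone, that update can be sketched like this. The helper name repoint_origin is made up, and the URL in the usage line is only a placeholder; copy the real clone URL from the repo's page in Bitbucket after the transfer.

```shell
# Sketch: point an existing clone's "origin" remote at the repo's new home.
# repoint_origin is a hypothetical helper; run it from inside the clone.
repoint_origin() {
    local new_url="$1"
    git remote set-url origin "$new_url" || return 1
    # Show the fetch and push URLs so you can confirm the change.
    git remote -v
}
```

Usage would look something like repoint_origin https://bitbucket.org/[team]/[repo].git, with the bracketed parts replaced by your own team and repo names.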

Good luck!