Live notes: images and AzuleJoe

This is a live notes post. It is of very low interest to almost all readers, but I do believe that the more open my work is, the better that work goes.   These are posts mostly written for me, but if you arrived here from a search engine and it looks like I once had a problem that you have now then feel free to drop me a line and I’ll put things into order a little bit.

#28/03/2016-19:16:43 GMT+1:

I also have a sneaking suspicion that there is a better way of doing this job, and I’m trying to write this post in such a way that I can easily ask sensible questions on Stack Exchange.

 

Today I’m working on AzuleJoe.   The code that extracts the images from PowerPoint works very well.  However, there are problems when the symbols are made up of several other symbols or are actually single characters as text (for example: ‘?’).

 

Today I want to go through one of the input files so that the images come through much more nicely when it goes through this process.

 

#28/03/2016-19:38:55 GMT+1:

I’ve realised that I can only ask for help with this once I’ve put up a process snapshot on equalitytime.co.uk.

 

#28/03/2016-19:50:36 GMT+1:

Snapshot post added.

I’d like to avoid checking each symbol individually, so one of the branches of the AzuleJoe code is set up to produce warnings when an image is missing.  But now I need to find it.

#28/03/2016-19:58:34 GMT+1:

Hmm, I thought it was in that branch. Let’s try gitX.

#28/03/2016-20:02:34 GMT+1:

Hmm, it appears that there is no trace of such a branch. I’d better create one.  The new branch is called pictureWarnings. It will be needed later on so that users can be warned if something is wrong with what they uploaded.

#28/03/2016-20:13:40 GMT+1:

When looking for a link for these notes, I’ve found a problem with the communikate site. Hang on.

#28/03/2016-20:16:36 GMT+1:

Fixed.

I’m using the recently released CK12 for this exercise, mostly because CK20 has its own problems with AzuleJoe right now. The file I’m using (and hopefully updating during this process) is this one.

#28/03/2016-20:19:45 GMT+1:

I’ve set

print_exceptions = True

in grab_text and used

./create.sh CK12+.pptx

 

to run a test.  It produced a lot of exceptions I wasn’t expecting (I’ll make a note to look at that later), but the code worked fine.  The problem I’m attacking is clear. This is the main page (Screen Shot 2016-03-28 at 20.24.25) and I want to be alerted when there is a missing icon.

 

#28/03/2016-20:30:29 GMT+1:

Alert readers will notice that I just set print_exceptions to True and then was surprised about all the exceptions being printed. Sigh.

 

#28/03/2016-20:31:28 GMT+1:

Working out how to detect missing images.

#28/03/2016-20:38:00 GMT+1:

Learned the interesting fact that, in Python, a dictionary can be indexed by a tuple. Obvious in hindsight, but there you go.
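As a minimal sketch of the tuple-key idea: the names labels, images and find_missing_images below are hypothetical and not taken from the AzuleJoe code. A grid can be stored as dictionaries keyed by (column, row), and a cell with a label but no image is then a simple lookup. Tuples work as keys because they are immutable and hashable.

```python
# Hypothetical data: each grid cell is addressed by a (column, row) tuple.
labels = {(0, 1): "Coke", (1, 1): "car", (2, 1): "lorry"}
images = {(0, 1): "coke.png", (2, 1): "lorry.png"}


def find_missing_images(labels, images):
    """Return the sorted (column, row) keys that have a label but no image."""
    return sorted(cell for cell, label in labels.items()
                  if label and cell not in images)


missing = find_missing_images(labels, images)
print(missing)
```

Blank labels are skipped here on purpose, so that empty cells do not raise alerts.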

#28/03/2016-20:47:29 GMT+1:

There is now a very basic way of detecting when something has a label without an image but – wait that’s actually wrong.  Hang on.

 

#28/03/2016-20:55:56 GMT+1:

Okay, the proof of concept is now semi working and produces the correct text for the first four slides, but is running into our old friend UnicodeEncodeError.

#28/03/2016-20:58:43 GMT+1:

Just used the comment above as the commit message, that will be useful. In future I should add a link from the commit message to the relevant live log.

#28/03/2016-21:04:19 GMT+1:

Hmm, fixing what I assumed was an unrelated bug about printing an alert for blank cells appears to have fixed the UnicodeEncodeError. I’ll take it.

#28/03/2016-21:08:24 GMT+1:

Committed and this is the result. missingImages.

 

#28/03/2016-21:20:33 GMT+1:

Some quick work has located and solved one of the existing problems.  Some images were simply too big for AzuleJoe to recognise them.  I’ve fixed a few while checking.  This is the screencast for those sorts of errors.

 

 

 

#28/03/2016-21:27:47 GMT+1:  Commit on joereddington.com/5743/2016/03/28/live-notes/

A slide design flaw meant that the code was complaining that a page’s title didn’t have a label.  Fixed now.

 

#28/03/2016-21:36:16 GMT+1:

Some images are missing in the code because they have been put together as shapes in the PowerPoint. It’s easily fixed. Here’s a relevant YouTube clip:

 

 

#28/03/2016-21:38:42 GMT+1:

Now let’s look at a few more errors:

 

WARNING: image missing at column 0, row  2 (label: Coke) on slide:Fizzy drink:

This was text in the form of a symbol. Fixed in the same way as a shape in the second YouTube video.
WARNING: image missing at column 1, row  2 (label: car) on slide:transport

Needed a crop.
WARNING: image missing at column 2, row  3 (label: lorry) on slide:transport

Needed a crop.

 

WARNING: image missing at column 1, row  1 (label: Weather) on slide:Nature

Shapes that needed converting to a picture.
WARNING: image missing at column 2, row  1 (label: Animals) on slide:Nature

Shapes that needed converting to a picture. Cropped as well.
WARNING: image missing at column 2, row  2 (label: Fast food) on slide:Nature

The image was fine – there was a hidden label lurking behind a blank square.
WARNING: image missing at column 3, row  1 (label: gardening) on slide:Nature

Shapes that needed converting to a picture.
WARNING: image missing at column 3, row  2 (label: foggy) on slide:Weather

Shapes that needed converting to a picture.

 

#28/03/2016-21:47:53 GMT+1:

I’ve rerun the code after the above errors were corrected and they have all been dealt with successfully.  Just another 43 to go. But that’s just on this one file. The CK20 version will be much more of a problem. So let’s find out if there is a faster way.

#28/03/2016-22:08:50 GMT+1:

I’ve written two Super User questions: one for the cropping, and another for the combining of images.   I’ll give them a day to see if there are better ways to do the job.

 

In the meantime, I want to tidy the code a little.

 

#28/03/2016-22:14:39 GMT+1:
Commit on joereddington.com/5743/2016/03/28/live-notes/

I’ve added a switch IMAGE_WARNING to make the alerts optional for now.
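A rough sketch of how such a switch might gate the alerts. Only the name IMAGE_WARNING and the wording of the warning come from these notes; the function image_warning and its parameters are assumptions for illustration.

```python
# Assumption: a module-level flag like the IMAGE_WARNING switch mentioned above.
IMAGE_WARNING = True


def image_warning(col, row, label, slide_title, enabled=None):
    """Return the warning text, or None when warnings are switched off."""
    if enabled is None:
        enabled = IMAGE_WARNING
    if not enabled:
        return None
    return ("WARNING: image missing at column {}, row  {} "
            "(label: {}) on slide:{}").format(col, row, label, slide_title)


message = image_warning(1, 2, "car", "transport")
if message is not None:
    print(message)
```

Returning the text instead of printing it inside the function keeps the switch easy to test.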

 

#28/03/2016-22:18:32 GMT+1:

I merged the branch with master and pushed to the repo; you can view it here.

New Day

#29/03/2016-03:46:06 GMT+1:

Both SE questions are still waiting for answers. I’ve added more information to one in the hope it encourages responses.

 

#07/04/2016-09:43:45 GMT+1:

…and nothing.  Never mind, we’ll just have to do it properly.

 

#07/04/2016-09:46:21 GMT+1:

Okay, in master and have switched on image warnings.

 

This is the full set of remaining image warnings:

WARNING: image missing at column 3, row  3 (label: materials) on slide:Pet care

The text turned out to be orphaned text that was behind the slide.  Deleted
WARNING: image missing at column 1, row  2 (label: look) on slide:Action words
WARNING: image missing at column 2, row  1 (label: Friends) on slide:friends
WARNING: image missing at column 3, row  1 (label: People who help) on slide:friends
WARNING: image missing at column 3, row  2 (label: jobs) on slide:friends
WARNING: image missing at column 2, row  1 (label: Friends) on slide:People who help
WARNING: image missing at column 3, row  2 (label: jobs) on slide:People who help
WARNING: image missing at column 2, row  1 (label: Friends) on slide:jobs
WARNING: image missing at column 3, row  1 (label: People who help) on slide:jobs
WARNING: image missing at column 0, row  2 (label: fast) on slide:opposites
WARNING: image missing at column 0, row  2 (label: 3) on slide:numbers
WARNING: image missing at column 1, row  1 (label: 123) on slide:numbers
WARNING: image missing at column 1, row  2 (label: 4) on slide:numbers
WARNING: image missing at column 2, row  1 (label: 1) on slide:numbers
WARNING: image missing at column 2, row  2 (label: 5) on slide:numbers
WARNING: image missing at column 3, row  1 (label: 2) on slide:numbers
WARNING: image missing at column 1, row  1 (label: numbers) on slide:My day
WARNING: image missing at column 2, row  2 (label: Fast food) on slide:My day
WARNING: image missing at column 0, row  2 (label: bread) on slide:shops
WARNING: image missing at column 0, row  3 (label: market) on slide:shops
WARNING: image missing at column 1, row  2 (label: veg shop) on slide:shops
WARNING: image missing at column 1, row  3 (label: cafe) on slide:shops
WARNING: image missing at column 2, row  1 (label: supermarket) on slide:shops
WARNING: image missing at column 2, row  2 (label: shopping centre) on slide:shops
WARNING: image missing at column 2, row  3 (label: music shop) on slide:shops
WARNING: image missing at column 3, row  1 (label: butcher) on slide:shops
WARNING: image missing at column 3, row  2 (label: clothes shop) on slide:shops
WARNING: image missing at column 3, row  3 (label: pharmacy) on slide:shops
WARNING: image missing at column 1, row  3 (label: lake) on slide:Outside places
WARNING: image missing at column 3, row  1 (label: hall) on slide:Places at home
WARNING: image missing at column 0, row  3 (label: again) on slide:time
WARNING: image missing at column 3, row  3 (label: week) on slide:day
WARNING: image missing at column 1, row  3 (label: rewind) on slide:dvd
WARNING: image missing at column 2, row  3 (label: Play/pause) on slide:dvd
WARNING: image missing at column 3, row  2 (label: medication) on slide:dvd
WARNING: image missing at column 3, row  3 (label: Fast forward) on slide:dvd
WARNING: image missing at column 1, row  3 (label: rewind) on slide:mp3
WARNING: image missing at column 2, row  3 (label: Play/pause) on slide:mp3
WARNING: image missing at column 3, row  2 (label: medication) on slide:mp3
WARNING: image missing at column 3, row  3 (label: Fast forward) on slide:mp3
WARNING: image missing at column 0, row  2 (label: here) on slide:Little words
WARNING: image missing at column 0, row  3 (label: there) on slide:Little words
WARNING: image missing at column 1, row  3 (label: &) on slide:Little words

 

I’ve highlighted the ones in red – the numbers require me to fix some other aspects first… hmmm. Actually they don’t…

#07/04/2016-10:19:40 GMT+1:

We are now at this point

Mac-34363b77a5de:azulejoe josephreddington$ ./create.sh CK12+.pptx
mkdir: CK12+.pptx: File exists
WARNING: image missing at column 2, row  1 (label: Friends) on slide:friends
WARNING: image missing at column 3, row  1 (label: People who help) on slide:friends
WARNING: image missing at column 2, row  1 (label: Friends) on slide:jobs
WARNING: image missing at column 3, row  1 (label: People who help) on slide:jobs
WARNING: image missing at column 0, row  2 (label: 3) on slide:numbers
WARNING: image missing at column 1, row  1 (label: 123) on slide:numbers
WARNING: image missing at column 1, row  2 (label: 4) on slide:numbers
WARNING: image missing at column 2, row  1 (label: 1) on slide:numbers
WARNING: image missing at column 2, row  2 (label: 5) on slide:numbers
WARNING: image missing at column 3, row  1 (label: 2) on slide:numbers
WARNING: image missing at column 1, row  3 (label: rewind) on slide:dvd
WARNING: image missing at column 2, row  3 (label: Play/pause) on slide:dvd
WARNING: image missing at column 3, row  2 (label: medication) on slide:dvd
WARNING: image missing at column 3, row  3 (label: Fast forward) on slide:dvd
WARNING: image missing at column 1, row  3 (label: rewind) on slide:mp3
WARNING: image missing at column 2, row  3 (label: Play/pause) on slide:mp3
WARNING: image missing at column 3, row  2 (label: medication) on slide:mp3
WARNING: image missing at column 3, row  3 (label: Fast forward) on slide:mp3
WARNING: image missing at column 0, row  2 (label: here) on slide:Little words
WARNING: image missing at column 0, row  3 (label: there) on slide:Little words
WARNING: image missing at column 1, row  3 (label: &) on slide:Little words
Mac-34363b77a5de:azulejoe josephreddington$

I’m going to record a quick screencast for the layering issue, particularly now we have only one left…

#07/04/2016-10:25:37 GMT+1:

Brief pause to sort out batteries in mouse.

#07/04/2016-10:39:38 GMT+1:

screencast done and uploading.

 

 

#07/04/2016-10:55:02 GMT+1:

Whoop!

 

Mac-34363b77a5de:azulejoe josephreddington$ ./create.sh CK12+.pptx
mkdir: CK12+.pptx: File exists
Mac-34363b77a5de:azulejoe josephreddington$

Unfortunately, this is only one of the files. Now to produce the same list for the larger size.

 

#07/04/2016-10:58:21 GMT+1:

Turns out to be non-trivial, branching…

checkpicture20 is the branchname.

 

#07/04/2016-11:06:55 GMT+1:

Ah-ha! Now running – just generating the file. output

#19/04/2016-10:45:18 GMT+1:

Have come back to this as a light introduction before I fix a much more serious bug.

 

#19/04/2016-10:45:51 GMT+1:

I have a corrected CK20 file provided by someone else. Let’s check it.   Uploading now.

 

#19/04/2016-10:48:29 GMT+1:

Seems legit, but there are some oddities. I’ll run it through the scanner again. Where do I find that?

 

#19/04/2016-10:50:09 GMT+1:

There’s been quite a lot of development in another notes branch.  So I might be starting a little behind.

 

Switched to a new branch ‘checkimages’

 

#19/04/2016-10:51:14 GMT+1:

Switched image warnings on.

 

I’ve moved the file ‘CK20processLinkcorrceted.pptx’ into ‘uploads’ as a test file.  Running…

MacBook-Air:azulejoe josephreddington$ python grab_text.py CK20processLinkcorrceted.pptx 5
Warning, shape outside of page area on page:6
WARNING: image missing at column 0, row  4 (label: C) on slide:My Stories
WARNING: image missing at column 4, row  4 (label: C) on slide:My Stories
WARNING: image missing at column 4, row  2 (label: ) on slide:Fizzy drinks
WARNING: image missing at column 4, row  3 (label: ) on slide:Fizzy drinks
WARNING: image missing at column 0, row  2 (label: ) on slide:milkshake
WARNING: image missing at column 4, row  2 (label: ) on slide:milkshake
WARNING: image missing at column 4, row  3 (label: ) on slide:Fast food
WARNING: image missing at column 4, row  3 (label: ) on slide:lunch
unknownSlide 23 link here: [4] [3]
unknownSlide 23 link here: [3] [2]
unknownSlide 23 link here: [2] [2]
unknownSlide 23 link here: [3] [1]
WARNING: image missing at column 4, row  2 (label: ) on slide:alcohol
WARNING: image missing at column 4, row  3 (label: ) on slide:alcohol
WARNING: image missing at column 4, row  3 (label: ) on slide:juice
WARNING: image missing at column 0, row  2 (label: ) on slide:friends
WARNING: image missing at column 0, row  2 (label: ) on slide:scwfriends
WARNING: image missing at column 0, row  2 (label: ) on slide:Support staff
WARNING: image missing at column 0, row  2 (label: ) on slide:Education staff
unknownSlide 37 link here: [3] [3]
unknownSlide 37 link here: [3] [1]
Traceback (most recent call last):
  File "grab_text.py", line 307, in <module>
    export_images(grids[slide_number], slide)
  File "grab_text.py", line 211, in export_images
    print "WARNING: image missing at column {}, row  {} (label: {}) on slide:{}".format(col, row, labels[col][row], grid.tag)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)

 

This is certainly shorter than the previous file, but that might be because of the ascii exception…

 

#19/04/2016-11:10:09 GMT+1:

I am lost in Unicode hell right now and I’m starting to believe that I should be moving the code over to Python 3 shortly…
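For reference, a small sketch of the failure mode, written for Python 3. The label text and the helper ascii_safe are made up for illustration, and this is not necessarily the fix that ended up in grab_text.py. A right single quote (U+2019) cannot be encoded as ASCII, which is what the traceback complains about; one blunt workaround is to replace such characters before printing.

```python
label = "Joe\u2019s day"  # hypothetical label containing U+2019

# Encoding to ASCII fails in the same way as the traceback above.
try:
    label.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True


def ascii_safe(text):
    """Replace characters ASCII cannot represent so printing never raises."""
    return text.encode("ascii", errors="replace").decode("ascii")


print(failed)
print(ascii_safe(label))
```

Replacing characters loses information, so writing the output as UTF-8 is usually the better long-term answer.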

 

#19/04/2016-11:12:12 GMT+1:

Ah-ha! Currently working – now let me take out some of the things that might not be helping…

 

#19/04/2016-11:14:26 GMT+1:

Have committed the tiny change. Now to see where we are.

 

Okay, the previous file had 275 missing image alerts. Let’s look at this one.   Ah – 77, excellent.  But I suspect there are a few things to look at.
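Counting the alerts by hand gets old quickly. A minimal sketch of tallying them from captured output, assuming the warning format shown in these notes; the names WARNING_RE and count_warnings are hypothetical.

```python
import re
from collections import Counter

# Pattern based on the warning lines shown in these notes.
WARNING_RE = re.compile(
    r"^WARNING: image missing at column (\d+), row\s+(\d+) "
    r"\(label: (.*)\) on slide:(.*)$")


def count_warnings(output):
    """Count missing-image warnings per slide title in captured output."""
    counts = Counter()
    for line in output.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            counts[match.group(4)] += 1
    return counts


sample = """WARNING: image missing at column 1, row  2 (label: car) on slide:transport
WARNING: image missing at column 2, row  3 (label: lorry) on slide:transport
unknownSlide 23 link here: [4] [3]
WARNING: image missing at column 4, row  2 (label: ) on slide:alcohol"""

counts = count_warnings(sample)
print(sum(counts.values()), dict(counts))
```

Grouping by slide title also shows which slides need the most attention first.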

 

#19/04/2016-11:23:15 GMT+1:

 

Okay – let’s reupload and see what’s changed.

 

#19/04/2016-11:28:44 GMT+1:

Side note – I’m experimenting with allowing multiple uploads ‘over’ each other.  We’ll see how that works…

 

#19/04/2016-11:30:04 GMT+1:

Okay, so there is a problem where I have only specified one dimension of scaling. Better fix that.

 

#19/04/2016-11:32:25 GMT+1:

There’s another problem with icons simply being out of date. That’s going to take a while… (‘things’ is a prime example).

 

#19/04/2016-11:36:38 GMT+1:

Ah-ha! They aren’t out of date – they are just being replaced by later versions… I should alter the code for this – I like the idea of each icon being named the same thing (it saves space as well) but homophones will break it, as will just more exact terms.
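One possible way round the clash, sketched under assumptions: this is not how AzuleJoe names its files, and icon_filename and registry are invented for illustration. The idea is to keep the plain label as the file name when the image is the same, so the space saving survives, and add a short content hash only when a different image arrives under the same label.

```python
import hashlib


def icon_filename(label, image_bytes, registry):
    """Pick a file name for an icon, reusing the plain label name when the
    same image is already stored and adding a short hash on a clash."""
    digest = hashlib.sha1(image_bytes).hexdigest()[:8]
    plain = "{}.png".format(label)
    if registry.setdefault(plain, digest) == digest:
        return plain  # first use of this label, or an identical image
    hashed = "{}-{}.png".format(label, digest)
    registry[hashed] = digest
    return hashed


registry = {}  # hypothetical map of file name -> content digest
first = icon_filename("things", b"old icon bytes", registry)
again = icon_filename("things", b"old icon bytes", registry)
newer = icon_filename("things", b"new icon bytes", registry)
print(first, again, newer)
```

Labels containing characters such as "/" would still need cleaning before being used as file names.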

 

Sigh.

 

#19/04/2016-11:38:40 GMT+1:

Okay, I haven’t got time for this change right now, but I’ll add it to the list.   Once I’ve fixed that, we’ll go back and look at the remaining issues. (I wonder if I can use the layering to ignore -text-behind-text-?)

 

#11/05/2016-18:09:22 GMT+1:

Fixing the remaining issues right now.

 

#11/05/2016-18:15:48 GMT+1:

The only thing I have a problem with is that the ‘food’ picture isn’t showing up….

 

#11/05/2016-18:17:23 GMT+1:

I replaced the image and it looks good.

 

#20/05/2016-10:57:12 GMT+1:

So I’m coming back to this one, for the fairly sensible reason that I accidentally deleted my copy of the CK12 file I’d fixed.  Someone else has fixed up a version for me and sent it over; I’m looking at it now.

 

#20/05/2016-10:59:41 GMT+1:

Hmm, the images appear great (I’m going to do a proper check shortly), but I’m having a problem with this:

 

 

Screen Shot 2016-05-20 at 11.00.06

Clicking on sections with NO image causes a problem…

#20/05/2016-11:01:31 GMT+1:

But such problems DON’T appear here…

Screen Shot 2016-05-20 at 11.01.20

Ah – the difference appears to be that these are set as folders rather than squares… sigh.

 

#20/05/2016-11:19:46 GMT+1:

Think I’ve dealt with this now (over several iterations). Now I want to make sure that all warnings are on.

 

#20/05/2016-11:26:08 GMT+1:

Okay, so I’ve found a link that isn’t linked. Worse, I’m slightly unsure where it should go.  I’ve emailed the designer.

Nevertheless, it appears that this version of CK12 is pretty solid, and I can do a release.

 

While I’m waiting for a reply, I’ll have a look at CK20.

 

#20/05/2016-11:30:24 GMT+1:

CK20master has the same problem that I just fixed in CK12 and has a few other warnings.  I’ll have to switch exception view back off for a bit though.

#20/05/2016-11:44:24 GMT+1:

Damnit, going through the CK20master file has made it clear to me that I need to have the “special::” code working before I put out the v2 of CK20.  Which is a big pain. I’m further behind than I thought.    However, this notes post is about preparing the images correctly. So let’s get on with that.

 

#20/05/2016-11:54:47 GMT+1:

Working through more of the CK20 images, there are a few where I’m struggling to work out what went wrong.

 

Moved around a bunch more things in CK20. It’s a big thing. It’s also looking increasingly clear that I need to write some code to do a direct language swap.

 
