A Fine Reader (of PDF files in particular)

Like most academics, I rely increasingly on online resources and there are now millions of pages of old journals, magazines, newspapers, etc. accessible online. In most cases, the files are stored in PDF (Adobe Acrobat) format, and optical character recognition (OCR) has been used to turn the scanned images into searchable text. Much time is saved, but a couple of problems plague me (and, I assume, many other users of these sites). Firstly, the OCR text is seldom proof-read and can be of very low quality (as a rule with OCR, the lower the quality of the original image, the lower the accuracy of the final text – and a lot of the resources I use have texts scanned from old printed materials or from microfilms of old printed materials). Secondly, in many cases when you download a PDF file to read or refer to later, the OCR text isn’t part of the file – all you get is the image.

I use Adobe Acrobat Pro to create, edit and read PDF files (I know it’s expensive, but the education pricing makes it affordable). It has its own built-in OCR capability, but it doesn’t seem very accurate and there’s no easy way to edit the images that comprise the PDF (for example, to improve their contrast and thus the accuracy of the scanning). So, recently I’ve gone back to a program that I’ve not used in years, ABBYY FineReader 12 (again, not cheap, but having bought it in the past, I was able to buy an upgrade at the educational price).

As with most decent software, there’s a free demo version so that you can test it (and it runs on Windows and Mac). After playing with it for a few days, I’ve found it’s extremely accurate, even with old, fuzzy texts, and it has a couple of nice features that Acrobat Pro lacks. Firstly, it has a built-in image editor, so if your scanned image has dark edges, or other marks that confuse the OCR, you can delete them before you start. You can also eliminate big white borders (a pain when you’re trying to view your PDF at “page width” and want the text nice and readable). Even more usefully, FineReader lets you adjust settings in great detail; you can, for example, boost the contrast in the image, edit the levels (to eliminate a grey background), or deskew the image – and then choose whether to apply the edit to all pages, the current page, or a selection. And FineReader can also pre-process all or some of the images automatically and – unlike Acrobat Pro – you have quite a lot of control over what it does when pre-processing (you can select options such as “reduce noise” or “whiten background”, for example). The result, in my tests so far, is close to 100% accuracy for most of the PDFs I’ve converted (and of course you have the option to verify and correct the text before saving, if you want to). And it’s fast: on my PC, a 20-page PDF file is converted in under 10 seconds.
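For anyone curious what “editing the levels” to get rid of a grey background amounts to, here is a rough sketch in Python. It is only an illustration of the general idea, not FineReader’s actual method: the function name `apply_levels`, the black and white points (40 and 200) and the sample pixel values are all my own assumptions. Each 8-bit grey value is stretched so that anything darker than the black point becomes pure black and anything lighter than the white point becomes pure white.

```python
def apply_levels(pixels, black=40, white=200):
    """Stretch 8-bit grayscale values between a black point and a white point.

    Values at or below `black` become 0, values at or above `white` become 255,
    and everything in between is scaled linearly. The defaults are illustrative
    guesses, not settings taken from any real OCR package.
    """
    if not 0 <= black < white <= 255:
        raise ValueError("need 0 <= black < white <= 255")
    scale = 255 / (white - black)
    result = []
    for value in pixels:
        stretched = round((value - black) * scale)
        result.append(max(0, min(255, stretched)))  # clamp to the 8-bit range
    return result


# A hypothetical row from a scan: dark ink (30), a mid-grey smudge (100)
# and a greyish paper background (230).
row = [30, 100, 230]
cleaned = apply_levels(row)
```

With these made-up numbers the ink becomes solid black and the greyish paper becomes solid white, which is roughly why a levels adjustment tends to help OCR on murky scans.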

And, of course, FineReader can do all the normal stuff an OCR package does, like scan pages directly to the format of your choice (Word, Excel, PDF, etc.).

So, if you need to do this kind of thing with documents, I recommend this very highly; over the course of the book I’m currently researching, I expect it to save me hundreds of hours.

UPDATE [15 September 2016]

I recently hit a problem with FineReader; whenever I tried to start the program I got an error that said “ABBYY licensing service is unavailable. The RPC server is unavailable.” I contacted ABBYY’s online help and after a couple of very quick emails they were able to solve the problem (you open Windows Services, select FineReader, and change the startup type from “Automatic” to “Automatic (delayed start)”). I was very impressed with the speed and efficiency with which ABBYY’s technicians resolved the issue. If you’re facing the same problem, there is more information on their website.

End of the affair (or how I learned to love an iPad)

My Surface Pro 3 got stolen last year and I decided to wait for version 4 to replace it. After reading various reports of initial teething troubles, I waited until the first major firmware update (a couple of weeks ago) before taking the plunge.

I won’t bore you with the details of the 48 hours of hell it took me to get it working; just be glad you weren’t there. When I finally got it going, I found the battery life was simply awful: 2.5 hours on a full charge, without doing anything very demanding (no games, no video editing, no mega downloads). I also found that the handwriting recognition (which is what I use it for most) worked less well in Windows 10 than it had with Windows 8.1 (for example, the text entry box is now one line instead of two and cannot be resized). And, since handwriting recognition is part of Windows, there are never likely to be many third-party handwriting apps (not that there are many Windows apps of any kind).

So, much to my surprise, I found myself reading reviews of the new iPad Pro…

I have never been much of a fan of Apple anything, but I must say it was nice to be able to go to an Apple Store and just play with an iPad for as long as I wanted. It was the feel of the Apple “Pencil” that I wanted to experience. (They couldn’t call it a stylus, of course: Steve Jobs would rise from his grave, and what would Tim Cook do for a living then?) Despite the silly name, the Pencil felt very nice to write with: a slightly softer tip, with a little more friction, would be even nicer, but it seemed good.

So, I took the plunge and have now been an iPad user for 3 whole days. And, so far, I am very happy. The key to my happiness is a free utility called MyScript Stylus. I have been using the Android version on my Samsung Galaxy Note 4 for a while and it’s very useful, but on the iPad’s massive screen it is an absolute delight. It’s fast and accurate and, apart from a slight “tappity tap” noise as I write, it’s a real pleasure to write with. Some apps seem to dislike it (Chrome hates it and Safari is unsure), but OneNote – which I use most – has not had a single problem so far (touch wood). I suspect that if I were an Evernote user and got the Penultimate add-on, I would be even happier, but we will never know as I prefer OneNote. I hope that future versions (or rivals) will make it even better (editing text is a bit cumbersome, for example), but it’s OK for now.

And the iPad Pro’s battery lasts much, much longer than the Surface’s does. I haven’t been able to verify Apple’s claim of 10 hours, but I have managed about 6 without getting near the end of the battery life. And it really is a pleasure to watch TV on it: great picture and impressive sound. (Shame Arsenal couldn’t manage a goal, but I suppose Apple can’t be blamed for that.)

I will try to update this once I’ve been using the iPad for longer.

Almost in love (with Microsoft’s Surface Pro 2)

I write for a living. When I’m not writing books and articles, I am writing notes for future books and articles, or I am writing emails and reports, or comments on my students’ work. And, for the last 20 years, I have been doing almost all of that on a computer, mostly a desktop PC. Like everyone else in the world (with the possible exception of members of an isolated tribe in the New Guinea highlands who have yet to see a white person, other than Lady Gaga), I have been reading reports of the death of the PC for the last couple of years. Apparently sales are plummeting because everyone is buying tablets.

Now, I’m a man who likes a nice gadget and my curiosity usually outweighs both my commonsense and concern for the environment. Also, I have a new role at work, which involves going to a lot more meetings. I make notes in these meetings, using a pen and paper (younger readers, if any, may wish to research those terms), which is quick and easy, and doesn’t distract me too much from what people are actually saying in the meeting. The problems begin when I try to find my notes later so as to remember what I had promised to do in respect of the various “action points” that arose.

At times my aching eyes, back, neck and arms lead me to suspect that I spend too much time sitting at a desk in front of the PC. So, I try to vary things. For example, when I’m reading for research I sit in a nice comfortable armchair, or stretch out on the couch, with a book or a print-out of an article that I’ve downloaded, a pen, and a pad of Post-It notes. Much nicer. But when I’m done reading I have to go back to the computer and type up my notes, otherwise I will eventually lose the book or the Post-It notes will fall out. And there are other problems. I was re-reading my copy of J.B.S. Haldane’s Daedalus, or, Science and the Future (1924) in preparation for a class a few weeks ago. At one point, Haldane comments that “To light a lamp as a source of light is about as wasteful of energy as to burn down one’s house to roast one’s pork”. Attached to this is a Post-It note on which I’d written “roast pork – Elia (Charles Lamb) – also mentioned by J.S. Huxley”. Well, as Robert Browning almost said, when I wrote that, only God and Jim Endersby knew what it meant, but now, only God knows. In slightly over five decades of fairly steady (albeit uneven) use, my brain seems to have developed a few bald patches, where it no longer grips as it once did. Colleagues comment approvingly on how quickly I reply to emails, but the truth is that if I don’t do things immediately, I forget to do them at all. I rely increasingly on the computer to remind me of things – names and dates and my friends’ children’s birthdays. And my assumption is that this need will become greater in time.

So, a few weeks ago, after much research, I bought myself a tablet computer and ended up with a Microsoft Surface Pro 2 (slightly to my surprise). Anyone who does a lot of writing with a computer and wonders how much of it you might do with a tablet (and why you might want to switch) may be interested to know how I arrived at the choice, and what I think the pros and cons of this tablet (and, to some extent, tablets in general) are.

What, no iPad?

We already have an iPad in the house; the kids love it, especially because they can make movies on it; iMovie and Edumotion (a very simple, easy-to-use stop-motion animation program) are wonderfully easy to use and the business of making visual content is genuinely intuitive, especially for the “pointer” generation (see Jennifer Egan, A Visit from the Goon Squad. Great book.) However, the iPad is visual; when it comes to words, it is mainly for consuming rather than creating them. Personally, I find the little on-screen keyboards on gadgets like this an absolute pain. You could, of course, buy a little tiny (and fearsomely expensive) Apple keyboard, which gives you a little tiny (and fearsomely expensive) laptop, that is woefully underpowered and awkward to use. No thank you. I have a laptop and when I want to use a laptop I prefer a 13” screen and a full-sized keyboard. However, the problem with laptops is that the screen is too close to the keyboard, which exacerbates the neck-ache problem. This is a technology, like the PC itself, that bends you into an odd shape so that you can use it.

Take Note(s)

Apart from my “real” PCs (laptop, home and work desktops) the gadget that I use most is my phone, a Galaxy Note II, which is expensive – especially for someone who rarely makes phone calls. But I use it every day as a diary and to check emails. It synchronises all my appointments and my address book with my other computers and the screen is big enough that I can actually read it (my eyes, like the rest of me, are nearly 53 years old, and my relatively new varifocals make reading most phones a challenge).

However, the really exciting feature of the Note is that it has a stylus. Steve Jobs (“Most overrated individual in history?”, discuss.) once famously said that “if you see a stylus, they blew it”, a comment aimed at the old enemy, Microsoft, who introduced their tablets and “pen computing” platform back in 2000.[1] When it comes to styluses, I beg to differ (and when I’m as rich and famous as Steve Jobs was, no doubt people will actually care what I think). I simply cannot write emails or texts on a phone-sized screen using an on-screen keyboard (not, at least, at anything like the speed I can think, even though that’s slower than it used to be). But I can write with the Note’s stylus and it does a pretty good job of turning my horrible, illegible scrawl into recognisable text. (Especially if you replace the installed Samsung handwriting recognition app with MyScript Stylus which, even though it’s still in beta, does an even better job.) Thanks to the Note and its stylus I can sit on the couch, read my emails and write short answers, instead of having to go upstairs, switch on the PC, wait, and then spend even more of my day sitting at a desk.

The only problem with the Note is that the screen is too small for extended writing. For a phone that I mostly use as a diary, it’s a sensible compromise, but it made me think that maybe I needed something larger.

Taking the tablets

My first thought was simply to buy a bigger Note; Samsung make 8” and 10” versions of it. I played with both (thanks to PC World, almost the only shop in Britain that has working versions of the gadgets it sells on display, so you can actually try them). I was not persuaded. The 8” is too small and the screen resolution of the 10.1” seemed a bit too low (everything looked fuzzy). Samsung announced a new 2014 edition a couple of months ago. It took me weeks to find one and I was disappointed: it was fairly expensive and the handwriting recognition didn’t seem to work as well as on the Note II (maybe it just needed some tweaking, and of course in the shop I couldn’t install MyScript Stylus). But the bigger problem was that the tablet (like the phone) runs Android.

There is nothing wrong with Android, as an operating system for phones at least. It works fine and it does what it’s supposed to. My problem is that it’s an open platform and everyone does their own thing with it, so Samsung’s version of Android is not only different to everyone else’s but it varies slightly between their various gadgets. Also, Samsung are not primarily in the software business. As a result, they have no real interest in making apps/programs that will run on all platforms, nor in regularly updating and improving them. Once you’ve bought one of their gadgets, you are of no interest to them until you’re ready to buy another one. My other problem with Android is that I really rely on two bits of Microsoft software: OneNote and Outlook. I won’t explain here why I prefer OneNote to Evernote (some other time perhaps), but I’ve tried both and prefer OneNote. One of many reasons for that preference is that it’s really successfully integrated with Outlook; I can move items back and forth between the two easily and I find it simpler to organise my work that way. Outlook doesn’t exist at all on Android; I use a program called Touchdown instead, which is pretty good but lacks the tight integration with OneNote (and between its various components) that I’ve come to rely on. OneNote does exist on Android, and it’s usable but missing half the features of the full-fat Windows version. Given that Microsoft want everyone to buy and use Windows, they don’t have much interest in making really good Android apps. Neither Google (who make Android), nor Samsung, nor Microsoft is to blame for this; it’s the remorseless logic of late capitalism, but I can’t wait for the revolution before I sort my tablet needs out. (Although it might be cheaper if I did.)

iPad and other alternatives

Both Outlook and OneNote run on some Apple gadgets, but you can’t get the full versions on iOS, which is what the iPad runs, and the iPad doesn’t have a stylus. You can, of course, buy a stylus for an iPad, but that would be a capacitive stylus, which means it’s basically a big, fat, artificial finger. The Samsung Note’s stylus is an active digitiser. You can learn the difference (and why it matters) from a nice clear post on Michael Linberger’s Blog, but basically an active digitiser is more accurate; it will detect your palm (and ignore it), so that you can rest your hand on the touch screen while you’re writing; it has a fine tip that produces nice, fine lines; and – if you’re at all artistic – it is usually pressure-sensitive, so if you have a suitable application you can press harder to get heavier lines, etc. For me, though, it’s the accuracy that matters: effective handwriting recognition relies on it.

So, by this stage I realised that I wanted a tablet with an active digitiser, a 10” (or better) high-resolution screen, and I wanted to run Outlook and OneNote, which meant I would have to buy a Windows tablet. That left me with four choices: Microsoft’s Surface; Sony’s Tap 11; Lenovo’s ThinkPad Tablet 2; or the newly announced Dell Venue Pro 11.

Quick reasons for rejecting all non-Microsoft choices:

  • Sony is as expensive as the Surface but has slightly worse specifications, comes with a detached (and largely useless) keyboard, and is rather flimsy. Reports of the active digitiser are mixed.
  • ThinkPad is too old, has a low-res screen, and there’s no sign of a new model.
  • Dell looks very attractive but is not yet available in the UK and I was in a hurry. It may be worth checking out.

So, that left Microsoft. There are two models: the cheaper is the Surface 2 (which runs what was called Windows RT, but Microsoft is now so ashamed of it that it seems to be OSWAN, the Operating System Without A Name). Much has been written about the pros and cons of the Surface 2 and its anonymous operating system (most of it fairly negative), but for me it was a non-starter because it doesn’t have the active digitiser.

Finally, the Surface Pro 2

I am not going to write a full review; there are lots online and I read dozens before I made my decision; I found the ones on Techradar and PC Pro the most helpful. However, I will highlight the best and worst features from my perspective:

The Good

The digitiser is an absolute delight. Almost 20 years of typing on computers has destroyed my handwriting (and it was never great, as my primary school teacher, Mrs Worcester, used to constantly point out; she had an enlarged photograph of Queen Elizabeth I’s stupendous, italic handwriting on her classroom wall to inspire us. Didn’t work for me. Perhaps my lifelong aversion to our monarchy put me off.) I can no longer read my own writing, but more importantly, I cannot quickly search piles of hand-written notebooks (especially after my kids have “enhanced” them in various creative ways); as a result, I often cannot find the crucial note I need. (Thinking back to undergraduate days, that was why I spent two days at the Sydney Workers Educational Association learning to touch type in the first place.) The Surface recognises my handwriting, and it did so straight out of the box, but better still it “learns” over time, so the number of corrections gradually reduces. You can also spend time training it to improve its accuracy faster.

Size, weight and battery life are all acceptable, but not wonderful. I can comfortably rest it on a lap or knee while writing and it doesn’t get too hot. It feels robust and solid enough to carry around and while I wish it was lighter, it’s only half the weight of my ultrabook (a Dell XPS 13, so it’s pretty svelte).

The best thing, however, is how you can use it. When I take it to meetings, I can pre-load the agenda and minutes onto the tablet beforehand (less paper to recycle afterwards), then make notes of any key points during the meeting. I do all this in OneNote. Thanks to the magic of SkyDrive (and the fact that I work in a modern university with wifi in almost every room) by the time I get back to my office, my notes have all been synchronised with my desktop PC and I can look through the things I agreed to do, and just drag and drop an item onto my Outlook task list, or turn it into an email. No retyping, no searching my notes, no losing the printed agenda on which I had scrawled “must email Bill about this”.

There is also something minor, but – to me – rather lovely about the way the tablet affects my body language. When people use a laptop in a meeting, I feel as though they are not quite there: they’re “hiding” behind the upright screen, not listening to me (and possibly checking their emails or playing Solitaire). But a tablet can be almost flat on the desk, like a real pad of paper, which makes you look (and feel) as if you’re really part of the discussion. A small thing, but I like it.

My second main use for the tablet is annotating and commenting on PDF files or other documents. I do this more and more, partly to save paper and partly to be able to find my notes. I like to write on students’ work, for example. But if I print it out and write on it, they can’t read my writing, and I have to get the physical bit of paper to them, which leaves me with no copy of what I wrote. If I work on the electronic version (which is how most student work arrives these days), I can annotate it (and handwriting recognition means they can read it), keep a copy and email the comments to the students. For PDF files, I use Adobe’s Acrobat Pro XI (again, I tried other things but this is best, as long as you’re only paying the educational price). For Word files, I just use the built-in comment and “track changes” features. Neither is perfect, but they are each a big improvement on the hand-written alternative. Similarly with research notes; I can save them and search them at a later date. And it means I can take any number of documents with me on a train, plane or to a café. Read them. Make notes. And once I’m back in range of wifi my notes are all synchronised and available on my other computers.

My third main use for the Surface is to make notes when I am reading an actual physical book. Instead of having to prop the book up on a stand by the PC and give myself a crick in the neck twisting from book to screen and back, or use great handfuls of Post-It notes, I can now sit in an armchair, book in hand, tablet beside me and use OneNote to make notes as I go along. Not only can I read and search my notes, and copy key points into my documents, but I am already saving a lot on Post-It notes. Obviously, since the Surface costs £799 I will have to save a lot of Post-It notes before the gadget pays for itself. And then there’s the electricity. And the broadband. But as I plan to live to be about 150 and to read a couple of books a day, getting through at least 100 Post-It notes per book, I figure I will come out ahead. Just.

And, of course, the Surface also does email and web browsing, reads ebooks, etc. And apparently you can even play Solitaire on it. Not that I would know, of course.

The Bad

So much for the good news. The not-so-good is that the Surface is a classic Microsoft product, which means it’s trying to be all things to all people, which really means that you have to learn to do everything the Microsoft way. (Not for nothing is the company HQ’s official address “1 Microsoft Way”.) The Surface runs Windows 8.1 (pictured) which, in true Microsoft style, is both a dessert topping and a floor wax. You and I might think that it would be sensible to make two versions of Windows: a desktop version for large, non-touch screens that is designed to be used with a keyboard and mouse. (In which, for example, programs are represented by small icons, so that you can fit lots of them on your big screen.) And we would produce a separate version for tablets that is designed to work with a finger or stylus. (In which, for example, programs are represented as nice big squares, that are easy to prod with your finger.) Well, you and I might think that, but that’s why we will never work for Microsoft. At 1 Microsoft Way they follow the One Microsoft Way, which says that not only does Windows have to run on everything, but it has to be the same version of Windows on everything. I don’t understand why, but perhaps that’s why I’m not a multi-millionaire like Bill Gates.

(Windows 8.1 also embodies the other key part of the Microsoft philosophy, which is “If it ain’t broke, break it”, and then 3 months later, release an update that makes the product almost – but not quite – as good as it was before they “improved” it in the first place. But I digress…)

Sadly, the one-size-fits-all philosophy infects the rest of the Surface too. In many ways, it’s a beautifully designed piece of technology, but it is trying to be both a full-blown laptop and a tablet. And so it is a rather unhappy compromise. With the attached Type Cover (another £100, pictured) you have a rather unsatisfactory (and small, and overpriced) laptop. Without the keyboard you have a lovely tablet, but it’s a bit too heavy and hot, etc, and – depending on how you use it – it will only run for about 5-6 hours on its battery.

What is more irritating is that the stylus integration and handwriting recognition could have been even better, if Microsoft had focussed on making a perfect tablet instead of a slightly uncomfortable hybrid. For example, the handwriting recognition relies on a pop-up panel, but it takes up too much of the screen and it isn’t adjustable. At all. It should be possible to make it smaller (and semi-transparent) so that it doesn’t – for example – completely cover the web form you’re trying to complete. Nor does the pop-up appear automatically whenever you click a text box that requires text input. How hard could that be? And, while I’m whining, why does the OneNote app (i.e. the version that is supposedly designed for tablets) not allow you to simply write on it and recognise your writing? Instead, when you write you get a picture of your handwriting without even the option of converting it to text. This is seriously dumb. As a long-term user of Microsoft products, I often wonder if anybody in the company actually uses them.

The ugly?

Yet, despite my complaints, I am almost in love with my Surface. Why? Because, unlike almost all the gadgets I own, or have ever owned, I feel it is adapting to me, rather than my having to adapt myself (and my aching back) to the technology. This gadget actually makes life easier, for me at least. Obviously, it’s a device that is still evolving. I dream of a Surface 3 that weighs about 300g less, is about half as thick, runs for 10-12 hours on its battery, and has a 12” screen. It would also have the option of built-in 4G mobile networking. And a slot in which to store the stylus. Oh, and it would run “Windows Tablet” which combines the best features of Windows 8.1 with the best of Windows Phone. And please could mine be purple (the whole “any colour as long as it’s black” thing is so 1908).

But, in the meantime, the gadget and I are pretty happy together.