Archive for June, 2012

Finally got the courage to bite the bullet and switch the domain registrar from 1&1 to Nearly Free Speech. I have been forwarding the domain to bililite.nfshost.com for 5 months, but I was afraid that I would miss something in the transition or lose an important file. Transferring the domain means closing the account with 1&1 and deleting everything, so it was irrevocable.

Transferring a domain involves two steps: first releasing it from 1&1, then telling NFS to take over. Releasing it took a while, and since I had private registration I had to call 1&1 customer support to get it finally done. The correct steps are, as far as I can tell:

  1. Make sure the NFS website (like bililite.nfshost.com) correctly mirrors the original website, since you can't control exactly when the switch will happen and you would hate for the site to go down.
  2. On the 1&1 control panel, select Domains then select the check box for the correct domain.
  3. Select the Contact menu above the list of domains, select Private/Public Registration and make the domain public. If you want to keep it private even for the week or so of the transfer, skip this step but call customer service to have one of the administrators approve the transfer. NFS calls this inability to transfer a privately registered domain "Extortion by Proxy".
  4. Under Contacts, select Show Domain Contact Details and make sure those are right. If not, you'll never get the email to approve the transfer.
  5. Click the Info button, select Unlock Domain and copy the Auth code; that's the domain password that allows it to be transferred.

Now on the NFS side, log into your account and go to https://members.nearlyfreespeech.net/{yourusername}/domains/transfer?init=1. They will give you a to-do list before transferring, but you've already done all of that except the name server part, which is easier to do after transferring the domain. Enter the domain name and follow the instructions; they're straightforward. You'll need the Auth code from above.

Wait a week. At some point, you'll get an email asking you to approve the transfer. Do so.

Change the nameservers (use "Set up DNS and name servers automatically").

It all seems to work well, and I'm now completely running on my new host!

The PHP routines to fill in PDF forms work great, and now my partner wants to use them too. Changing the forms to make the physician name a fill-in field rather than fixed text is easy, but what to do about the signature? It's not like a check; a pixelated image would be fine (I'm not worried about someone forging a preschool physical exam note). But it's not that easy to insert an image into an existing PDF file. I don't want to parse the entire PDF and rewrite it.

Fortunately, images are stored as individual objects in the PDF file, so if there is an image I can identify, I can easily replace it with one of my own choosing. The placement and size of the image on the page is part of the page description, so the new image will be in exactly the same place as the old.
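To illustrate the "image I can identify" part, here is a minimal sketch in Python (not the PHP used for the actual routines) that lists the image objects in a PDF. It assumes the file is uncompressed and hand-editable, that objects are written as "N G obj ... endobj", and that image dictionaries contain /Subtype /Image; the function name find_image_objects is made up for this example.

```python
import re

# Matches "N G obj ... endobj" spans; assumes an uncompressed PDF
# whose stream data does not itself contain the word "endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)

def find_image_objects(pdf_bytes):
    """Return (object number, generation, byte offset) for each image object."""
    found = []
    for m in OBJ_RE.finditer(pdf_bytes):
        body = m.group(3)
        # Image XObjects are marked with /Subtype /Image in their dictionary.
        if re.search(rb"/Subtype\s*/Image\b", body):
            found.append((int(m.group(1)), int(m.group(2)), m.start()))
    return found
```

Once the object number is known, the replacement only has to rewrite that one object; the page description that places and scales it is left alone.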

Continue reading ‘Changing Images in a PDF’ »

I use a lot of forms at work. The more paperless the office gets, the more paper we generate. Every school has its own physical exam form, every government agency has its own application form, every screening test is another form for the parent to fill out. And my handwriting is atrocious. So I try to get PDF copies of everything, then use PDF Escape to add text boxes that I can fill in, and an image of my signature at the bottom. But when filling them out, that still leaves a lot of either typing or cut-and-paste from the EMR (electronic medical record) of the patient's name, birthdate, address etc. There had to be a better way, and one that uses only free tools (I'm not buying Acrobat for $400).

Fortunately, Adobe Reader can run a version of Javascript, and we can use that to help fill in the form.

Every PDF includes a /Catalog object that serves as the root object of the document. Normally it just includes a reference to the array of pages, but it can include other things like Javascript to be executed when the document is opened. The syntax is convoluted; it is a dictionary containing a dictionary containing a string:

1 0 obj
<<
  /Type /Catalog
  /Pages 2 0 R % a standard catalog entry
  /Names << % the Javascript entry
    /JavaScript <<
      /Names [
        (EmbeddedJS)
        <<
          /S /JavaScript
          /JS (
            app.alert('Hello, World!');
          )
        >>
      ]
    >>
  >> % end of the javascript entry
>>
endobj

That's complicated, but the coding part is straightforward: take an existing PDF, open it in a text editor, find /Catalog, insert the boilerplate after the /Pages reference, and put in your code. The PDF parser is smart enough to match parentheses, so as long as your code pairs them correctly (you don't have any strings like "We love smileys :)") you don't have to escape them. If you need to, escape them with a backslash. Actual backslashes in your code need to be escaped (write them as \\), since the PDF parser will read the string before interpreting it as Javascript.
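The same edit can be scripted. The following is a rough sketch in Python, assuming the PDF is uncompressed, has been read as text (for example with latin-1 decoding), lists /Type /Catalog before /Pages, and has no /Names entry yet. The function names are invented for this example. It escapes every parenthesis rather than checking whether they balance, which is always safe.

```python
import re

def pdf_string_escape(js):
    """Escape JavaScript source for use inside a PDF literal string ( ... )."""
    # Backslashes first, so the ones added for parentheses are not doubled.
    js = js.replace("\\", "\\\\")
    # Escaping every parenthesis is safe whether or not they are balanced.
    return js.replace("(", "\\(").replace(")", "\\)")

def add_open_script(pdf_text, js, name="EmbeddedJS"):
    """Insert a /Names /JavaScript entry after the catalog's /Pages reference."""
    entry = (
        "\n/Names << /JavaScript << /Names [ (%s) "
        "<< /S /JavaScript /JS (%s) >> ] >> >>" % (name, pdf_string_escape(js))
    )
    # Find "/Type /Catalog" and then the first "/Pages N G R" after it.
    pattern = re.compile(r"(/Type\s*/Catalog.*?/Pages\s+\d+\s+\d+\s+R)", re.DOTALL)
    # A function replacement keeps the backslashes in entry untouched.
    new_text, count = pattern.subn(lambda m: m.group(1) + entry, pdf_text, count=1)
    if count == 0:
        raise ValueError("no /Catalog with a /Pages reference found")
    return new_text
```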

This will create an incorrect PDF file, since the xref table no longer has the correct byte offsets. Adobe Reader will correct this automatically, as will PDF Escape, but they compress and otherwise munge up the code so it's impossible to edit further.
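For the curious, the table itself is simple enough to regenerate. This is only a sketch in Python, assuming an uncompressed file with LF or CRLF line endings, each object starting at the beginning of a line, and objects numbered 1..n with no gaps; build_xref is a name made up here. It produces only the table; the startxref value in the trailer would also have to be updated to point at it.

```python
import re

def build_xref(pdf_bytes):
    """Return a classic xref section for the objects found in pdf_bytes."""
    offsets = {}
    # Record the byte offset of every "N G obj" that starts a line.
    for m in re.finditer(rb"(?m)^(\d+)\s+(\d+)\s+obj\b", pdf_bytes):
        offsets[int(m.group(1))] = (m.start(), int(m.group(2)))
    if not offsets:
        raise ValueError("no objects found")
    size = max(offsets) + 1
    if len(offsets) != size - 1:
        raise ValueError("object numbers have gaps; not handled in this sketch")
    # Entry 0 is always the head of the free list.
    lines = [b"xref", b"0 %d" % size, b"0000000000 65535 f "]
    for num in range(1, size):
        off, gen = offsets[num]
        # Each entry is 20 bytes: 10-digit offset, 5-digit generation, "n", EOL.
        lines.append(b"%010d %05d n " % (off, gen))
    return b"\n".join(lines) + b"\n"
```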

See a sample blank page that says "Hello, World".

Continue reading ‘Adding Javascript to PDF Files’ »

This is dumb: I'm typing away on WordPress's fullscreen mode and all of a sudden my keyboard goes haywire: the double quote is now an at sign; the backslash is now a hash. The hint to what's wrong is that the hash sign (shift-3) is now a British pound sign. WordPress thinks I'm using a British keyboard! Other applications work fine. Searching for anything like "Wordpress British keyboard" turns up nothing. It's not until I try typing special characters in the address bar that I realize it's not WordPress; it's Firefox.

Turns out Left Shift + Left Alt is the keyboard switcher, and I had both the US and British keyboards listed as available input methods. I must have accidentally hit that at some point. It's not clear why only some applications were affected (I think the keyboard is whatever the setting is when that particular application starts). Anyway, Control Panel->Regional and Language Options->Text Services and Input Languages lets me remove the offending keyboard, and I don't have to worry about it again.

Just found another useful tool for manipulating PDF's: the PDF Toolkit. It's a command line tool based on iText that I use mostly for merging PDF's together. The big downside is that embedded Javascript is lost, so that has to be added to the PDF after it has been put together.
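Merging uses pdftk's cat operation: pdftk in1.pdf in2.pdf cat output merged.pdf. A small Python wrapper, sketched here on the assumption that pdftk is installed and on the PATH (the function names are made up for the example):

```python
import subprocess

def pdftk_merge_command(inputs, output):
    """Build the argv list for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *inputs, "cat", "output", output]

def merge_pdfs(inputs, output):
    """Run pdftk; raises CalledProcessError if it exits with an error."""
    subprocess.run(pdftk_merge_command(inputs, output), check=True)
```

Since the merge drops embedded Javascript, any script has to be added to the merged file afterwards.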

So my free (as in free beer and free speech) PDF tools now include:

  • Open Office, for creating PDF's from word processing documents.
  • PDF Escape, for modifying and adding fields (text, checkboxes). This does preserve Javascript code, but compresses everything so you can't edit it further.
  • pdftk, for merging PDF's.
  • tcpdf, for creating PDF's with PHP.
  • A good text editor and a thorough understanding of the PDF specification, to hand tweak.

The PDF specification is very particular about the byte position of each element, with a table at the end that specifies exactly where in the file everything is, but the most recent Adobe Reader is pretty forgiving (an alert flashes for a moment saying that it is trying to fix the file). That's important if I'm hand-tweaking a PDF, since I can't correct the cross reference table. PDF Escape, fortunately, will correct everything, so if it's important I can just upload the tweaked PDF and download it back.

I use Bing for the search box on bililite.com, and it's worked well; simple API, no need to create a custom search engine as with Google. Unfortunately, Microsoft is losing almost half a million dollars an hour on Bing, and they want me to make up the difference. Well, not me alone, but they are going to start charging for using their web services. Fortunately, they are (as of now) providing a free tier of up to 5,000 queries a month, which is far more than I need.

So I have to sign up for Azure Marketplace (Azure is Microsoft's cloud service), subscribe to the Bing Web Search API and create an application key. Then I need to convert my old requests into the new format. Luckily, Microsoft provides a migration guide (as a Word document!), and that includes sample code in PHP. The biggest difference is the need for HTTP authentication. The code from Microsoft works, as long as I leave out the proxy line in the context parameters (I guess they only tested their code on local servers) and file_get_contents works on URLs, which is enabled on my service with Nearly Free Speech. I imagine setting the header similarly with cURL would also work.

The other big difference is that they no longer return the total number of results if not all of them were returned. Now they return a parameter __next (note the two underscores) that contains the URL for getting more results if they are available. Since I'm only showing a limited list, I just need to test for the existence of that parameter to indicate that more results are available.
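The two changes can be sketched in a few lines. This is Python rather than the PHP in the post, and it rests on assumptions that are not stated above: that the account key is sent as both user name and password in an HTTP Basic header, and that the JSON response wraps everything in a "d" object with a "results" list. Only the __next parameter comes from the description above; the function names are hypothetical.

```python
import base64
import json

def basic_auth_header(account_key):
    """HTTP Basic header; assumes the key serves as both user name and password."""
    token = base64.b64encode(f"{account_key}:{account_key}".encode("ascii"))
    return {"Authorization": "Basic " + token.decode("ascii")}

def parse_results(body, limit=10):
    """Return (results, more_available) from a JSON response body."""
    data = json.loads(body)["d"]  # assumed wrapper object
    results = data.get("results", [])[:limit]
    # The presence of __next is the only signal that more results exist.
    return results, "__next" in data
```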

So the updated code is:

Continue reading ‘New Bing API’ »

I'd like to keep my font files on Amazon S3 and save the cost of storing them on my web server. imagettftext doesn't say whether I can use a URL for the font file, but imagettfbbox unambiguously says:

fontfile The name of the TrueType font file (can be a URL).

Unfortunately, they are lying, for both functions. The font file needs to reside in the local filesystem.

But I pay for storage, and I want plenty of open-source fonts available for the webservices, and that adds up. So I have to pull the fonts from Amazon S3 into a temporary file and pass that to imagettftext/imagettfbbox. But then I need to remember to delete the temporary file as soon as I'm done with it, to minimize the storage cost. The only way to guarantee that (that I know about) is with the old C++ technique, Resource Acquisition Is Initialization: create an object that I know will be destroyed and make deleting the file part of that object's destructor. (We usually don't have to worry about that sort of thing in PHP since the most common resource we use is memory, and the garbage collector takes care of deleting unused memory resources.)

I also want to keep track of which fonts have already been downloaded, since I can use the same temporary file.

So the following class uses my CDN class for access to the files, and requires a writeable directory for the cache for the temporary files. $fontdir is the local directory where the class looks for the font files initially, pulling them off the S3 server if not available locally.
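The shape of the idea, sketched in Python rather than PHP: the cleanup is tied to a context manager instead of a destructor, which is the usual Python counterpart of RAII. The fetch callable stands in for the CDN class and is purely hypothetical, as is the FontCache name; the local $fontdir lookup is left out to keep the sketch short.

```python
import os
import tempfile

class FontCache:
    """Download each font once into a temp file; delete the files on exit."""

    def __init__(self, fetch, cache_dir=None):
        self._fetch = fetch          # hypothetical callable: font name -> bytes
        self._cache_dir = cache_dir  # must be writable; None uses the system default
        self._paths = {}             # font name -> temporary file path

    def path(self, name):
        """Return a local path for the font, downloading it on first use."""
        if name not in self._paths:
            data = self._fetch(name)  # fetch first, so a failure leaves no file
            fd, tmp = tempfile.mkstemp(suffix=".ttf", dir=self._cache_dir)
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            self._paths[name] = tmp
        return self._paths[name]

    def close(self):
        """Delete every temporary file this cache created."""
        for tmp in self._paths.values():
            if os.path.exists(tmp):
                os.remove(tmp)
        self._paths.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
```

Used as "with FontCache(fetch) as fonts:", every path handed out inside the block stays valid until the block ends, and the files are removed even if drawing the text raises an error.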

Continue reading ‘Using imagettftext with Off-site Font URLs’ »