Handwriting "Synthesis"
Dan Morris, 2002
dmorris@cs.stanford.edu

The last time I had to write something by hand, I realized how incredibly
tedious it is to actually write things by hand.  It really takes like hours
and hours to write just a couple pages.  I wrote these scripts with that in
mind.

Generating arbitrary handwriting was too hard for me, and it wouldn't have
solved the problem of having my computer make something that looked like my
handwriting.  Having the computer look at a sample of my handwriting and
automatically figure out how I make each of the letters of the alphabet would
be ideal, but is also a hard problem.  So this set of scripts represents a
compromise.

Basically you scan an example of your handwriting, then manually define
important points on a few examples of each letter.  The Matlab scripts in
this package can then take the examples and put them together in a slightly
randomized fashion to 'write' a given piece of text.  The script randomly
chooses one of the samples of 'e' that you defined each time it encounters
an 'e', for example.  It also randomly varies the spacing between letters and
lines of text, and automatically generates line breaks at appropriate points.
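
The random selection, spacing jitter, and line breaking described above can
be sketched roughly as follows.  This is only an illustration, not the
package's actual code: every variable name and every number here (sample
count, pixel widths, jitter range) is a made-up assumption, and the random
variation in line spacing is left out for brevity.

```matlab
% Sketch only: all names and numbers are hypothetical, not from the package.
msg      = 'see the bees';
nSamples = 3;      % pretend every letter has 3 hand-defined samples
letterW  = 20;     % nominal letter width, in pixels
spaceW   = 15;     % nominal width of a blank, in pixels
jitter   = 4;      % maximum random change in letter spacing, in pixels
lineW    = 150;    % usable line width, in pixels

x = 0;  row = 1;
placed = zeros(0, 4);            % rows: [charCode, sampleIndex, x, row]
words  = strsplit(msg, ' ');
for w = 1:numel(words)
    word = words{w};
    % Start a new line if the word might not fit, assuming worst-case spacing.
    worst = numel(word) * (letterW + jitter);
    if x > 0 && x + worst > lineW
        x = 0;  row = row + 1;
    end
    for c = word
        k = randi(nSamples);                          % pick a sample at random
        placed(end+1, :) = [double(c), k, x, row];    % record the placement
        x = x + letterW + randi([-jitter, jitter]);   % jittered advance
    end
    x = x + spaceW;
end
disp(placed)
```

Breaking only between words, and checking the worst case before placing a
word, keeps a word from being split across two lines.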

Then you can print it out and - especially if you print on an inkjet printer
instead of a laser printer, or tell your printer to use too much ink by telling
it the paper is really heavy - you'll get something that looks convincingly like
your handwriting.  You may also want to play with it in Photoshop or something
to blur it a little, smudge it a little, add some noise, etc., especially if
you're going to print on a laser printer.
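
If you'd rather stay in Matlab than open an image editor, a little blur and
noise can be approximated with base functions.  The sketch below is an
assumption about what "blur it a little, add some noise" might look like;
the kernel size and noise level are arbitrary, and the tiny synthetic image
stands in for a real rendered page.

```matlab
% Stand-in for a rendered page: white background (1) with one black stroke (0).
page = ones(40, 40);
page(20, 5:35) = 0;

% Soften edges with a small 3x3 averaging kernel.
kernel  = ones(3) / 9;
blurred = conv2(page, kernel, 'same');

% conv2 pads with zeros, which darkens the outer frame; restore those pixels.
blurred([1 end], :) = page([1 end], :);
blurred(:, [1 end]) = page(:, [1 end]);

% Add a little Gaussian noise, then clamp back into the range [0, 1].
noisy = blurred + 0.03 * randn(size(blurred));
noisy = min(max(noisy, 0), 1);
```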

The process I'm about to describe is how you turn a scanned image into an
'alphabet' that can be used to generate arbitrary text in your 'handwriting'.
An example of the whole process (except the scanning) is in
'demo_whole_process.m', a tiny script that shows you what you need to do to
make an alphabet and generate some handwriting.  It should be _almost_
self-explanatory; you probably don't even need to read the rest of this file.

Oh, also, anyone in the Stanford community might notice that the provided
image is in fact a "Stanford Fund" (TSF) letter.  For those outside of Stanford,
this is a painful handwriting ritual that student groups go through to get
funding from the University.  It's _horrible_.  It was clearly what inspired
this project, but - I say this to avoid getting the student groups I've
been a part of into any trouble - this project (at the time I'm posting it
on the web) has never been used for TSF.  I expect that it will work for that
purpose, however, should anyone want to know.

Also, another approach to solving the 'TSF problem' is to write two copies of
each letter (instead of 8), scan them at high res, do a quick cleanup in
Photoshop, print them each 3 times on an inkjet printer (counting the two
originals, now I have 8 copies of the letter), and arrange them in the pile so
that identical copies aren't right next to each other.  This is another
compromise that I expect would work fine for solving the 'TSF problem'.

---

You scan in a sample of your handwriting, which should be on plain white
paper.  The enclosed bitmap file is an example of what the script expects.
The quality of the scan doesn't matter, since only you will actually be
looking at the graphic representation of the letters.

Then you run 'getletters', which opens up the bitmap file you scanned.  It
zooms in on a portion of the image, and you go to work clicking bounding
boxes around each letter (the upper-left and lower-right corners that define
a box around each letter), then typing the name of the letter
(case-sensitive, of course) that you just bounded.  It sounds painful, but
you can do a letter about every second, so it's not so bad.  And you _won't_
do every letter in an image, just a few examples (like 7 or 8 maybe) of each
letter.  You should have more e's than x's, for example, since these will be
the samples that the script has to choose from when it's generating
handwritten text later on.
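
To make the bounding-box step concrete, here is a rough sketch of turning two
corner clicks and a typed name into one stored sample.  It is not the
'getletters' code; the struct fields and variable names are assumptions, and
the clicks are hard-coded so the sketch runs without a figure window (in an
interactive session they could come from Matlab's ginput(2)).

```matlab
img = rand(100, 200);        % stand-in for the scanned grayscale image

% Hypothetical clicks: upper-left corner first, then lower-right corner.
clicksX = [30.4; 52.7];      % horizontal positions (image columns)
clicksY = [10.2; 41.9];      % vertical positions (image rows)
label   = 'e';               % the case-sensitive name typed by the user

c = round(clicksX);
r = round(clicksY);

% Sort the corners so click order doesn't matter, and clamp to the image.
cols = max(1, min(c)) : min(size(img, 2), max(c));
rows = max(1, min(r)) : min(size(img, 1), max(r));

sample = struct('name', label, 'pixels', img(rows, cols));
```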

When you've gotten some good examples of a couple letters from the current
'zoomed' view, you click twice anywhere on the display to get the command
prompt back (that's sloppy, I know), then hit return - instead of providing
a letter name - to move to a new view.  When you're done with this whole
image, you press 'q' to quit.  Don't worry if you didn't get some letters in
this file or if you messed up; you can remove letters or merge 'alphabets'
later.

You then run 'getpoints', which cycles through each of the letters that you
bounded in the previous step, and asks you to click a few points that lie on
the contour that defines the letter.  About 8 points for each letter is
usually good; be sure to get important points like the ends of lines or
corners.  You'll get a feel for what points are important.  There's a little
GUI with three buttons:

* The red box says 'I'm all done with this letter'
* The blue box says 'I'm all done with this letter, but don't make the 
  default assumption that the baseline of this letter is at the lowest point
  that I clicked'.  You use this for letters that go below the baseline of
  normal text, like 'g' and 'y'.  You'll be prompted to click where the
  baseline is after you click this button.
* The green box says 'This letter isn't drawn with a single curve; I want
  to start a new 'stroke' for this letter'.  You use this for letters like
  'y', which are generally drawn with more than one line.  You wouldn't use
  this for 's', for example.
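
One way to picture what the three buttons produce: each letter sample ends up
as one or more strokes (lists of clicked points) plus a baseline.  The sketch
below is an assumed representation, not the format 'getpoints' really uses.
It also assumes image coordinates, where y grows downward, so the "lowest
point that I clicked" is the one with the largest y.

```matlab
% Two strokes of a hypothetical 'y', as [x y] points in image coordinates.
strokes = { [2 3; 5 12; 8 3], ...     % first stroke
            [8 3; 6 14; 3 22] };      % second stroke (started with green)

% Red button: the baseline defaults to the lowest clicked point.
allPts   = vertcat(strokes{:});
baseline = max(allPts(:, 2));

% Blue button: for a descender, an explicit baseline click overrides it.
clickedBaseline = 12;                 % hypothetical click position
baseline = clickedBaseline;

% Shift every stroke so the baseline sits at y = 0.
shifted = cellfun(@(s) [s(:, 1), s(:, 2) - baseline], strokes, ...
                  'UniformOutput', false);
```

Storing points relative to the baseline means letters with and without
descenders can be lined up on the same row of text.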

This script gives you an 'alphabet' of curves that define each example of
each letter in the samples you processed.  You can add more letters to the
alphabet by going through the above process again, on the same image or a
different image, and using 'mergealphabets' to incorporate the alphabets into
a single alphabet.  You can use 'showletter' to look at all the samples you
defined for a letter; this is useful if you generate some text and notice
that some instances of a particular letter look really bad and you need to
find out _which_ example of that letter is bad.  You can then use
'removeletter' to remove that example from the alphabet.
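
If an alphabet is thought of as a simple list of named samples, merging and
removing become short operations.  The sketch below assumes a struct array
with 'name' and 'points' fields; that layout is a guess made for illustration
and may not match what 'mergealphabets' and 'removeletter' actually work on.

```matlab
% Two hypothetical alphabets built from different scans.
a = struct('name', {'e', 'e', 'x'}, ...
           'points', {rand(8, 2), rand(8, 2), rand(6, 2)});
b = struct('name', {'e', 'g'}, ...
           'points', {rand(8, 2), rand(9, 2)});

% Merging is just concatenating the two lists of samples.
merged = [a, b];

% Find every sample of 'e' (the kind of lookup needed to inspect them all).
eIdx = find(strcmp({merged.name}, 'e'));

% Suppose the second 'e' looks bad: remove only that sample.
merged(eIdx(2)) = [];
```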

You can now use 'gentext' to turn an ASCII string into handwriting.  Hooray.
You can also use 'filetostring' to turn an ASCII file into a single Matlab
string.  Hoorah.
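
For reference, reading a text file into one string can also be done with
Matlab's built-in fileread.  The sketch below is not the 'filetostring'
implementation, and how that script treats line endings is not stated here;
the temporary file and its contents are made up so the example is
self-contained.

```matlab
% Write a small throwaway text file to read back.
fname = [tempname() '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Dear donor,\nThank you!\n');
fclose(fid);

% Read the whole file as one char row vector.
str = fileread(fname);

% Normalize Windows line endings and trim surrounding whitespace.
str = strrep(str, sprintf('\r\n'), sprintf('\n'));
str = strtrim(str);

delete(fname);
```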

