The other day, Caroline's brother asked, "What are the programs behind these pages?" (If you somehow are seeing this document from a non-standard address, "these pages" are the ones at http://technologists.com/photos/.) The answer is basically that there are only a few programs, and they are simple and short. The programs were the easy part of a project that is now approaching four years old. So let me step back and describe things with two objectives:
Since it was several years later before I did anything at all with this project, I'm not sure how much of this was subconscious percolation, how much was waiting for technology (especially scanners) to improve in price and performance, and how much was other factors that I can't directly identify. I do know the following:
First, I scanned all of the 35mm slides we had that I thought were worth preserving. This was partly because they were older (Caroline and I basically switched from slides to negatives/prints in 1984) and partly because I had no faith that our slide projector would persevere. (The projector is a high-quality Kodak from the early 70s, but what can you expect? When I used it last week, it would not advance slides at all -- I had to manually advance slides when I wanted to "rotate" the Carousel.) I put the images in directories named by the external numbering of the Carousels, 1-20, knowing this was temporary, and named the images by their numbering within the Carousel. For example, the picture of Margaret and Ray had the name http://technologists.com/carousel3/72-MargaretandRay500x750.jpg. (I've broken one of my own rules, that once a URL is created, it should never disappear, so you won't find that URL if you try to access it.)
Naming/numbering by Carousels made no sense to anyone but me, so I had to change. The obvious change was to base directory names on dates, which is what I did. Further, the granularity of the dates needed to reflect how many photos there were and how well I could guess at dates. So there is an 1800s directory for anything from the 19th century (delightfully, that directory is far from empty), there are directories by decade for the 20th century up through the 60s (some of the earlier decades, e.g., the 1940s, are more interesting to me than the 60s directory), and directories by year starting with 1970.
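As a rough illustration of the directory rule just described (this is a sketch in Python, not one of the actual Perl programs), the mapping from a year to a directory could look like the following. The function name `directory_for_year` is hypothetical, and the four-digit form of the per-year directory names (e.g., "1984") is an assumption, since only the "1800s" and decade-style names are spelled out here.

```python
def directory_for_year(year: int) -> str:
    """Return the photo directory name for a (possibly estimated) year."""
    if 1800 <= year <= 1899:
        return "1800s"                # one directory for the whole 19th century
    if 1900 <= year <= 1969:
        return f"{year // 10 * 10}s"  # one directory per decade, e.g. "1940s"
    if year >= 1970:
        return str(year)              # ASSUMED name format for per-year directories
    raise ValueError(f"no directory rule for year {year}")


if __name__ == "__main__":
    for y in (1885, 1943, 1969, 1970, 1984):
        print(y, "->", directory_for_year(y))
```

A guessed date only has to be good enough to land in the right bucket: any guess between 1940 and 1949 ends up in the same 1940s directory.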
Directory naming was the easy part. File naming was/is the hard part. If I scan the image myself, and recognize all the people and other aspects of the image, then the quality and consistency of the names are entirely my responsibility. I have several nicknames, so do I use my initials, CHS, or my proper name, Charles, or one of my nicknames to identify myself? The answer to that question may seem obvious, but in the context of a specific image, it isn't so obvious. Even so, references to myself are much easier than some of the other cases. For example, Caroline's father is Charles W. Abbitt, his son is Charles W. Abbitt II, and his grandson is Charles W. Abbitt III. Caroline's father is a retired Air Force Colonel, so in some contexts that should be recognized. (For example, in a family photo, it is easy to ignore his military title, but in a military photo, it is not.) Caroline's brother and his son have nicknames that they accept in family contexts, but maybe are not so fond of in non-family contexts. Even more complicating, many of the adult women have family names plus one or more surnames of their (ex-)husbands. How do I choose names/initials for them?
Finally, with the advent of digital photography, the camera is likely to pick an inscrutable file name, like SCN0743.jpg. Do I accept that name as it is, or do I try to change it to be more meaningful? That particular photo is of my daughter and me dancing at her wedding, so it is especially hard for me to leave that name unchanged. The name I chose is 03wAMDSCN0743.jpg. The "03w" prefix indicates the year (03) and the wedding (w), the "AM" indicates the photographer (Amanda Measles), and the rest is derived from the name the camera used.
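To make that naming scheme concrete, here is a small Python sketch that splits such a name back into its parts. It is only an illustration: the pattern (two-digit year, one lowercase event letter, two uppercase photographer initials, then the camera-derived remainder) is inferred from the single example above, and the names `NAME_PATTERN` and `parse_photo_name` are hypothetical.

```python
import re

# ASSUMED layout, inferred from one example name:
#   2-digit year, 1 lowercase event letter, 2 uppercase photographer
#   initials, then whatever was kept of the camera's own file name.
NAME_PATTERN = re.compile(r"^(\d{2})([a-z])([A-Z]{2})(.+)\.jpg$")


def parse_photo_name(filename):
    """Split a renamed digital photo into its parts, or return None."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    year, event, photographer, camera_part = match.groups()
    return {
        "year": year,
        "event": event,
        "photographer": photographer,
        "camera_part": camera_part,
    }


if __name__ == "__main__":
    print(parse_photo_name("03wAMDSCN0743.jpg"))
```

Names that do not follow the pattern, such as the scanned images named after people, simply come back as `None`.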
At first, http://technologists.com/photos/ was basically just a list of links to directories, e.g., 1800s, 1940s, etc. Clicking on one of those links takes you to an index page with "thumbnail" size images of the photos in that directory. Clicking on the thumbnail takes you to a page showing a "small" version of the photo. That single photo page also gives you a link to the "fullsize" version of the photo, i.e., the highest resolution version I have. If I scanned a 35mm slide or negative, then the scan was usually done at 2400 dots/inch, meaning a horizontal image would be 3000 by 2000 dots. If I scanned a print, then the scan was usually done at 600 dots/inch (the maximum capability of my flatbed scanner), so a 4x6 print would end up as 2400 by 3600 dots. The "small" versions are 25% of the fullsize in each dimension, e.g., 750 by 500 for the typical horizontal slide. The "thumbnail" is reduced to 25% again, e.g., 188 by 125 for the typical horizontal slide. Finally, there are "pinhead" versions that are 25% of the thumbnails, e.g., 47 by 31 dots for what started out as a 3000 by 2000 slide.
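The arithmetic behind those four sizes can be sketched in a few lines of Python. This is an illustration rather than the actual program; the helper names `quarter` and `size_ladder` are hypothetical, and rounding to the nearest dot (halves rounding up) is an assumption, chosen because it reproduces the 188 by 125 and 47 by 31 figures quoted above.

```python
def quarter(n):
    """Return n / 4 rounded to the nearest integer (halves round up)."""
    return (n + 2) // 4


def size_ladder(width, height):
    """Return the dimensions of each version, starting from the fullsize scan."""
    sizes = {"fullsize": (width, height)}
    # Each step is 25% of the previous one in each dimension.
    for name in ("small", "thumbnail", "pinhead"):
        width, height = quarter(width), quarter(height)
        sizes[name] = (width, height)
    return sizes


if __name__ == "__main__":
    for name, (w, h) in size_ladder(3000, 2000).items():
        print(f"{name}: {w} by {h}")
```

Because each step shrinks both dimensions to a quarter, each version has roughly one sixteenth as many dots as the one before it, which is why the pinhead page loads so much faster than the thumbnail page.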
Why all these different sizes? The fullsize images are basically for making prints, though they can be viewed on a large enough monitor. The "small" size seems to be the most natural for viewing on a computer monitor. The thumbnail size seems most natural for looking at a page with a lot of the pictures. However, at dialup speeds, a page with all the images at thumbnail size, tnindex.html, takes too long to load. The page with all of the pinhead images, index.htm, is more reasonable for loading at dialup speeds.
The first program, now named mkindex.pl, is a Perl program that starts with the fullsize images for any given century/decade/year and creates all of the smaller images and the associated HTML files. For example, when run directly, it might be run as:
cd 1940s
perl ../mkindex.pl 1940s
Usually, I don't use mkindex.pl directly, but instead use either dirs.sh, with a list of directories that have new pictures, or alldirs.sh, which goes through all of the century/decade/year directories. Both of those also run the programs that excerpt, based on the list in index.htm, the photos specific to a particular family (e.g., mkAbbittFamily.pl excerpts the Abbitt Family pictures) or to a particular person (e.g., mkCAS.pl excerpts the photos that include Caroline Abbitt Sauer), and the obligatory mkCats.pl, which makes the excerpts for our cats, past and present.
In some sense these extraction programs are the most difficult, because they have to have heuristics to include the different names a person might be known as (e.g., Caroline, C-line, CAS), to exclude pictures that don't belong (say, Johnson family pictures that don't belong in the Abbitt Family excerpts), and to exclude most of Liz and Randall's wedding photos while including a few particular wedding photos.
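The shape of such a heuristic can be sketched as follows. This is a simplified illustration in Python, not the logic of the actual Perl programs: the function name `belongs_in_excerpt`, its parameters, and all of the file names are hypothetical, and plain case-sensitive substring matching is an assumption that a real version would probably need to refine (a short alias like "CAS" can match inside unrelated names).

```python
def belongs_in_excerpt(filename, aliases, exclude_terms=(), always_include=()):
    """Decide whether a photo file name belongs in one person's excerpt."""
    # A few particular photos are included no matter what the other rules say.
    if filename in always_include:
        return True
    # Exclusions win over aliases, e.g. another family's pictures.
    if any(term in filename for term in exclude_terms):
        return False
    # Otherwise, include the photo if any known name for the person appears.
    return any(alias in filename for alias in aliases)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    photos = [
        "CarolineGarden.jpg",
        "C-lineAndCharles.jpg",
        "CASgraduation.jpg",
        "CarolineJohnsonReunion.jpg",
        "WeddingCarolineToast.jpg",
        "WeddingCarolineDance.jpg",
        "CharlesFishing.jpg",
    ]
    selected = [
        name for name in photos
        if belongs_in_excerpt(
            name,
            aliases=("Caroline", "C-line", "CAS"),
            exclude_terms=("Johnson", "Wedding"),
            always_include=("WeddingCarolineDance.jpg",),
        )
    ]
    print(selected)
```

The order of the checks is the design choice that matters here: the explicit include list is consulted first, so a handful of wedding photos can survive a blanket exclusion of the rest.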
I hope all of the above is understandable to those who wanted to know this stuff. Let me know what is confusing, and I'll try to clarify. (I may also say, "I know I don't have that right yet, but I intend to fix it." For example, I know of problems in the inclusion/exclusion heuristics where some pictures are being excluded that should be included, and vice versa.)