Sunday, April 30, 2017

MHGS Digitizing Project -- Choosing Our Platform

We've known since we started the redesign of our society website that one of its purposes was to display digitized resources.  Unfortunately, we found that WordPress, while terrific for all our other purposes, wasn't really right for a digital collections home.

Why not?  Well, first consider what we wanted to do.  In our minds, a digital collections home needed to host a variety of digital file types -- documents, photographs, sound and video.  We needed to be able to attach a significant amount of metadata to these files -- labels with the people, places and things included, plus information about where the items came from, copyright information, etc.  All this needed to be searched easily from within the site and from Google.  Everything had to display quickly and with a minimum of fuss.  It had to handle lots and lots of files -- we have thousands of photographs alone.  And finally, we wanted it to look at least semi-professional, which we think will help convince local organizations to let us digitize their archival materials.

So what were our problems with WordPress?  First, terrible search.  Sorry, WordPress, but it's true.  The site searching capabilities are awful.  Second, WordPress isn't really set up to handle a database of images like we wanted.  We didn't want to write individual posts about each photograph, which would take forever.  The album plug-ins we found were targeted more at art photographers, so the image display was lovely, but they didn't handle the metadata we wanted to include.  And nothing seemed ready to scale to thousands of images, videos, documents and recordings.  We couldn't figure out how to make a WordPress option look professional.

So then we started looking in the archive community.  Once we ruled out the options we couldn't afford (PastPerfect), we were left with DSpace, Greenstone, and Omeka.  All three are open source programs, which means that the software is free, and all three are targeted toward the academic archival market.

Greenstone was the first program I installed and tested.  At the time (mid-2015) it appeared to be the least supported and least functional of the programs.  It was a possibility until we found something we liked better.

DSpace is probably the most widely used among the big boys.  It actually seemed a bit too big for our purposes.  (Frankly, it intimidates the heck out of me.)

Omeka was kind of the Goldilocks product for us. Although it is used by professionals in the field, it is explicitly designed for those with very little technical experience.  There is even a version you can use for free without having to install it on your own website, although you lose some control.  It is actively under development and there's a good user-support base.  It's not perfect, but it's what we selected.

Monday, November 14, 2016

Planning a digitizing project

Our society has thousands of photographs stuck in boxes and file folders.  We decided we needed to locate them, digitize them, and put them in archival storage.  Such a simple idea.  Such a hard thing to do.

After almost a year of planning, we have finally started our project.  I thought I'd write up some of what we encountered -- perhaps we can cut some of the work for other societies?

The issues we had to address:

  • What are we going to scan?
  • How do we scan the items?  What kind of scanner?  What resolution?  What workflow?
  • How do we manage the digital files?  How to edit?  Metadata?  How to display on our website?
  • What is the impact of copyright and privacy laws and norms?
  • What do we do with the physical items after scanning?
  • Should we just scan the items that have been donated to us, or should we develop an action plan for identifying and acquiring/borrowing other items?

At the time we started creating a new society website, we formed a committee and had meetings.  It didn't work very well.  For this project, we had one person who pushed the project (me, the librarian), one person who handled the technical issues, a group of regular library volunteers and patrons who acted as a focus group, and the board, who made the policy decisions.

For a project like this, I find it helps to create a "policies and procedures" document at the outset.  At first, all you can add are the major section headers, but as you research and experiment and decide, you start filling things in, so it always reflects your current understanding of the project.  Ours is in a Google Docs file accessible to the librarian, the tech guy, and the president.  At several points during the year, I have printed it out and distributed it to the board.

Sections so far:
  • Overview -- our goals and some guidelines
  • Process -- soup to nuts, from scanning to putting on the web to storage
  • Photograph Scanning Standards (there will also be a document scanning section)
  • Metadata Standards
  • Copyright and Privacy Standards
  • Contributed Items (policies about items we don't own)
  • Appendix A: Julia's Notes on Copyright and Privacy 
  • Appendix B: Resources

Sunday, December 14, 2014

Book Review: The Invisible History of the Human Race

Wow -- what a great book!  I just finished reading The Invisible History of the Human Race by Christine Kenneally.

It's kind of hard to explain what the book is about, exactly.  Kenneally approaches the idea of genetics by starting with an overview of human ideas about inheritance, ancestry, race and genealogy.  In some of her early chapters, genealogists help the bad guys, and in some of the chapters, genealogists help the good guys.  She looks at situations where an obsession with bloodlines leads to genocide, and she also looks at situations where the denial of ancestral information is used as punishment.  I learned some very interesting and disturbing things about the eugenics movement.

Then she moves into the history of DNA testing, and the expansion of what can be tested, who is tested, and what kinds of things can be learned.  There's a lot of good background in here about the differences between deep history and recent generations, between Y, mtDNA, and atDNA, and between medical testing and genealogical testing.

One thing I appreciated was that each chapter explores one self-contained idea, while fitting nicely into the overall structure.  This makes it good for bed-time reading, and for making a topic this complex digestible.  I also appreciated the mix of personal anecdotes and scholarly research.

This is not, repeat not, a book that will tell you what DNA test to take or how to interpret your results.  It is, however, a good book for getting a sense of the forest of genetic testing before you start losing yourself in the trees of centimorgans and IBD vs IBS.

Overall, I enjoyed this book immensely, and highly recommend it.  And not just for genealogists.

Saturday, August 30, 2014

The National Institute of Standards and Technology Digital Archive

This is a site that sounds dull but turns out to be fascinating...The National Institute of Standards and Technology (yawn, right?) is digitizing its archives -- publications and photographs.  And the photo collections include their collection of aeronautical instruments and testing procedures, appliance efficiency testing projects, a collection of atomic clocks, automobile testing, and photos of the 1939 project to figure out how to preserve the original copies of the Declaration of Independence and US Constitution.  I didn't have the nerve to view the collection of dental research photos.  There are pictures of crystals and glass plate photography and space beads and, well, they're up to more than 150 photo collections.

In short -- if you have scientists or engineers (or dentists) in your family, you might well find a photo of them, or of tools and instruments they might have used, in this collection.  And, as the Legal Genealogist is always reminding us, photo collections produced by US government agencies are generally copyright free!

Herbert J. Reed of the Electrochemistry Section measuring specific gravity on a battery

Monday, February 24, 2014

Book Review: Finding Family: My Search for Roots and the Secrets in My DNA

The last genetic genealogy book on the list!  Finding Family: My Search for Roots and the Secrets in My DNA, by Richard Hill, isn't really a how-to book.  Instead, it's a narrative about one adoptee's search for his family.

The good: This is a story with a couple of happy "endings," as Hill was able to identify both biological parents and develop good relationships with family.  The search starts in the days of phone calls and letters and ends with internet searches and DNA, and demonstrates the wide net an adoptee must cast for any possible clue.

The bad: Not to diminish Hill's work, but his case seems relatively easy -- it appears that he was just about the only person who didn't know, the adoption was handled through family connections as opposed to an agency, and only the government seems to have been trying to hide it.  A huge percentage of the challenge came from logistical issues caused by the passage of time, such as tracking down people who had moved or died, rather than outright secrecy or lost records.   And, it appears that a bunch of the work was done by other people, largely a volunteer with an adoptee support group.

My takeaways:  This is a huge emotional minefield, not to be entered lightly.  Adoptee support groups are really important.  And we really need to change the laws around adoption and official records; I find the idea of certifying a falsified birth certificate repugnant as both a genealogist and a citizen.

I would suggest reading this book as a way to prepare for the mental and emotional components of an adoptee search, but not as a handbook for learning techniques for conducting such a search.

Saturday, February 22, 2014

Don't Forget to Look for Digital Books!

After receiving an email mentioning a 987-page (!) genealogy of one of my families, I was feeling sad that it's out of print.  Googling didn't find anything helpful, and suggested helpfully that the nearest copy was in a library 187 miles away.  But...clicking on the Editions link in WorldCat brought up a list of 4 editions of the book, one of which was said to be digital.  And clicking on that brought up a link to the Hathi Trust website, where the entire book is digitized, searchable, and free for anyone.

Lesson: is your friend.

And, if you're wondering, the book is The Basye Family in the United States, by Otto Basye, and the link is

Oddly enough, while Google didn't find the digital version of the book, it did find a description of Otto's papers, which were donated to the State Historical Society of Missouri.  Apparently there are approximately 4 shelf-feet of materials, largely the research for the book.

Thursday, February 20, 2014

Book Review: DNA USA: A Genetic Portrait of America by Bryan Sykes

DNA USA, by Bryan Sykes, is the next book in my genetic reading pile.  It's a lot bigger than the others, and some of it is worth the extra heft.  The book is about Sykes' attempt to follow up a project mapping the genetics of Britain with a project mapping the genetic history of the United States.  It's divided into three parts:  a review of the science and the history of the science, a narrative about the "road trip" he took while working on the project, and a very quick review of the results.

The good:  The first section of the book is very interesting.  There's a lot more here about Native American and African-American deep genetic history than I've seen in the other books I've read, and it's very easy to understand.  Sykes writes just enough about his British projects to ground the reader and give a sense of why the USA version of this project would be hard.  There's also some good history of the relationships between Native Americans and genetics research, and African Americans and medical researchers, that shows why large segments of the US community might not find DNA testing to be a good thing, and might justifiably consider it a very bad thing.  And, finally, some of the results he gets, placed in context with the results of other studies, illustrate some very interesting points about race in America.

The bad: The second section is a drag.  Sykes can't seem to decide whether it's a story about taking a road trip with his son, who's about to start college, or a story about how depressing Indian reservations are, or a description of a genetics project.  As a result, it mostly fails in every respect.  In the first case, he seems to have missed the fact that the trips in the great road trip movies he keeps referencing were, in general, taken by car, not by train.  Descriptions of train stations do not a great road trip story make, even if he tells us what's on his son's iPod. Second, while it's useful to learn about the complicated and messed up history of Native Americans and genetics research, it's not really interesting to read about tours he took in which he neither discusses genetics research nor conducts any while on Native reservations.  And finally, after making a nicely convincing case in the first section about how hard a genetics study would be in the US, he does what couldn't even be called a half-assed job of gathering samples -- he, for example, blows off visiting the entire South after gathering a couple of samples from some people from Atlanta that he meets in a hotel bar in San Francisco.  The whole second section is self-indulgent and slow.  And, since he doesn't have very many samples and he's wasted a ton of pages in the second section, the third section feels both rushed and incomplete.

Unfortunately, I can't recommend that you just read the first section and skip the other two, because he buries some very interesting things in with the tedious.  Skimming is your friend, here.

My takeaways:  I need to investigate African Ancestry, a DNA testing company; I wonder how it compares to FamilyTreeDNA, 23andMe, and  I need to read up on some of the ethics issues involved in Native American and African American DNA testing and make sure they get covered in our DNA SIG meetings at the library.

...And I should never invite Bryan Sykes and Spencer Wells to the same party.  Although it's covered in pretty language, both of them say some pretty nasty things about each other in their books.