As I’ve indicated on several occasions, about a year ago, I used FamilySearch to start collecting biographical data on a random pool of 5th New Hampshire veterans (all of whom were original volunteers from when the regiment was first organized). I first started doing so because one of my students was interested in doing a statistical analysis of the life outcomes of these men. Although she did not proceed with the project, I continued because I thought collecting data would provide some important insights into the experiences of the soldiers who fought with the regiment. Moreover, as I kept going, I realized that my investigation was uncovering some interesting stories that I could pursue in the future. By the time I finished collecting data over a week ago, I had obtained information about 403 veterans. (I refer to them now as “400,” partially for convenience’s sake and partially as an allusion to the Spartan “300.”) That information is now transcribed in a 579-page, single-spaced document in 10-point Calibri font.
Just a couple of days ago, I finished transferring much of the data to an Excel spreadsheet so that I could sort and search the information in various ways. I am now ready to start looking at what the numbers reveal about these veterans.
How the “400” Were Selected
Several years ago, I got my hands on Augustus D. Ayling’s Revised Register of the Soldiers and Sailors of New Hampshire in the War of the Rebellion, 1861-1866. The Revised Register has a brief service record for every New Hampshire serviceman who fought in the Civil War. In the case of the 5th New Hampshire, Ayling’s book has the records of some 2500 men. A couple of student researchers and I (thank you Greg Valcourt ’19 and William Bearce ’19) spent a number of months transferring the information for these soldiers to an Excel spreadsheet.
When the student who was interested in doing a statistical analysis of the 5th New Hampshire’s veterans approached me, I thought it would make sense to limit the study to the original volunteers who enlisted in September and October 1861. First, they would be easier to find on FamilySearch since the vast majority were native-born and very few deserted. Second, I had to limit the project in some fashion, and the 1000-odd men who were the first members of the regiment seemed easier to deal with than the entire 2500 who passed through the unit’s ranks during the war.
I created an Excel spreadsheet with information exclusively associated with the original volunteers and, moving in alphabetical order, started collecting biographical information about every man on that spreadsheet who survived the war. After assembling about seven or eight biographies, I realized that I had undertaken an enormous task. I asked one of my colleagues in Sociology who has an affinity for statistics if I could take a shortcut. Would it be possible, I wondered, to collect information about a smaller pool of men (randomly selected from among the 1000) that would still yield statistically useful information? She thought the minimum size of the pool should be about 300. To get to that number, we agreed that I should proceed by alphabetical order, picking every other soldier on my Excel spreadsheet. If I landed on somebody who had not survived the war, I would have to go to the next person who had survived. And that’s what I did, except for one thing: I kept the information on my spreadsheet about the first seven or eight survivors in a row that I collected before I spoke to my colleague. The whole process, though, seems sufficiently random to me. And as we shall see, for a variety of reasons, we should keep in mind that the data are not exactly characterized by great exactitude.
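For readers who like to see a procedure spelled out, here is a minimal Python sketch of the selection rule described above. It is not the workflow I actually used (I worked by hand in Excel), and the roster, the record layout, and the names `select_pool`, `head_start`, and `step` are all hypothetical. It also assumes that after sliding forward to a survivor, the every-other-man count restarts from that survivor, which is only one reasonable reading of the rule.

```python
from typing import List, Tuple

# Hypothetical record layout: (name, survived_the_war)
Soldier = Tuple[str, bool]


def select_pool(roster: List[Soldier], head_start: int = 0, step: int = 2) -> List[str]:
    """Select men from an alphabetical roster.

    Phase 1 keeps the first `head_start` survivors in a row.
    Phase 2 picks every `step`-th row, sliding forward to the next
    survivor whenever the pick lands on a man who did not survive.
    """
    selected: List[str] = []
    pos = 0

    # Phase 1: keep the first few survivors outright.
    while pos < len(roster) and len(selected) < head_start:
        name, survived = roster[pos]
        if survived:
            selected.append(name)
        pos += 1

    # The last kept man sits at pos - 1, so the next pick is `step` rows after him.
    if head_start > 0:
        pos += step - 1

    # Phase 2: every `step`-th row, skipping ahead past non-survivors.
    while pos < len(roster):
        while pos < len(roster) and not roster[pos][1]:
            pos += 1
        if pos < len(roster):
            selected.append(roster[pos][0])
            pos += step  # assumption: counting restarts from the man chosen

    return selected


# Invented roster for illustration only; these are not real soldiers.
roster = [
    ("A", True), ("B", True), ("C", False), ("D", True), ("E", False),
    ("F", False), ("G", True), ("H", True), ("I", True),
]
print(select_pool(roster, head_start=2))
```

Setting `head_start=0` gives the pure every-other-man rule; a nonzero value mimics keeping the first few survivors collected before the rule was adopted.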
What I Learned about the Data
So why are the data not characterized by great exactitude? Perhaps the most important reason is that most of the information was self-reported, and self-reported information is unreliable. For example, men often changed their names or misrepresented their occupations, and when you throw careless (or overworked?) census-takers into the mix, matters become very complicated. Among other things, people in those days often seemed pretty cavalier about reporting their age with any accuracy. And if there’s one thing I discovered, it’s that men lied about their age throughout their lives. They lied to the recruiting officer because they were either too young or too old to volunteer for the army. They lied when they got married, especially if their 16-year-old wife was less than half their age. They lied when they got old so that they would appear more venerable and respected. They lied for reasons known only to themselves.
For that reason, I often had to make educated guesses about when men were born (especially since I found so few birth records from this period). The documents I located were not always helpful because those compiling them often did not ask for a man’s birthday—rather, they asked for his age. That mode of proceeding led to certain problems. If a man was truthful when he told the recruiting officer that he was 21 in 1861, he could have been born in either 1839 or 1840.
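The arithmetic behind that ambiguity is simple enough to sketch in a few lines of Python. This is only an illustration; the function name `birth_year_range` is hypothetical, and it assumes the stated age is truthful and that nothing is known about the man's birthday within the year.

```python
from typing import Tuple


def birth_year_range(age: int, record_year: int) -> Tuple[int, int]:
    """Return the (earliest, latest) possible birth year for a stated age.

    If the man had already had his birthday that year, he was born in
    record_year - age; if not, he was born one year earlier.
    """
    latest = record_year - age
    earliest = latest - 1
    return earliest, latest


# A recruit who truthfully gave his age as 21 in 1861:
print(birth_year_range(21, 1861))
```

A single age on a single document therefore narrows the birth year only to a two-year window, and that is before any allowance for men who shaded the truth.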
This is the top portion of a page for the 1860 Census in Ward 2 of Worcester, MA. Freeman Hutchins (aka Moses Freeman Hutchins), who appears on this page, served only briefly with Company E of the 5th New Hampshire; he was discharged disabled on January 10, 1862, after having been with the regiment for less than two and a half months. Later in 1862, Hutchins served with the 12th New Hampshire for just under three months before he was discharged disabled again. Here we see one of the biggest problems with census records; aside from the fact that all of the information was self-reported, the form only indicates the ages of the respondents on the date of the census (in this case, June 14, 1860). Hutchins claimed he was 23 on that day. Was he born in 1836 or 1837? Or was he lying about his age and born in some other year?
For that reason, my unit of measurement was years instead of anything more precise, and that meant ages often got rounded up. For example, a man born in December 1830 who died in February 1896 was only about 65 years and 2 months old at the end of his life, but since I was dealing in years, I had to enter his lifespan as 66 (1896 minus 1830). I suppose this phenomenon may have exerted some upward pressure on my calculations regarding lifespan, but I found accurate birthdays so infrequently that there was nothing I could do about it. At the same time, I rounded to the month for the soldiers’ length of service; it didn’t seem to make sense to me to use a more precise unit of measurement.
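To make the rounding effect concrete, here is a small Python sketch comparing the year-only figure with the age a man had actually completed. The function names are hypothetical, and the second function assumes the months of birth and death are known, which was rarely the case in my records.

```python
def lifespan_by_year(birth_year: int, death_year: int) -> int:
    """Lifespan as entered when only the years are known."""
    return death_year - birth_year


def completed_years(birth_year: int, birth_month: int,
                    death_year: int, death_month: int) -> int:
    """Whole years actually lived, using month-level dates.

    Assumption: if death falls in the birth month, the birthday
    is treated as already reached.
    """
    years = death_year - birth_year
    if death_month < birth_month:
        years -= 1  # he died before that year's birthday
    return years


# The example above: born December 1830, died February 1896.
entered = lifespan_by_year(1830, 1896)
actual = completed_years(1830, 12, 1896, 2)
print(entered, actual, entered - actual)
```

The year-only figure can overstate a lifespan by up to one year and never understates it, which is why any bias in my lifespan numbers should run upward.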
There are other problems too. Much of the medical information in my records is suspect for a variety of reasons. For example, it’s unclear how causes of death were determined in the late 19th and early 20th centuries. And, of course, medical science was not what it is today. It is also not clear how good country doctors in New Hampshire were at distinguishing one illness from another, let alone the veterans themselves or the census-takers in 1890. In addition, the same terms were used differently over 100 years ago. And yet, despite these problems, the medical information is still useful in some ways (but that’s for a future post).
This is the entry for Stephen L. Stearns in the records of the Eastern Branch of the National Home for Disabled Volunteer Soldiers in Togus, ME. Stearns, who was admitted on September 5, 1889, had served in Company G, 5th New Hampshire Volunteer Infantry, from October 1861 to November 1863. According to this entry, Stearns claimed he contracted diabetes at the Battle of Fair Oaks. While such a claim may seem odd, some research suggests that trauma or stress can trigger the onset of diabetes among those who are predisposed to the illness.
Finally, there are holes in the data. Some types of data were easier to collect than others, and some men proved easier to track than others (deserters, immigrants, and those moving far away from New England often proved most difficult to locate). But I did find enough information about enough men to make some generalizations.
This is Only the Beginning
My collection of information and my analysis of it are works in progress and will be so for quite some time. Yes, I’ve probably made some errors in transcription and similar such mistakes. I caught some of these as I transcribed information to the Excel spreadsheet that contains information about the “400.” But beyond that, in looking at what I’ve collected, I’ve made the obvious realization that data prompt as many questions as they answer. Data alone mean little without interpretation, and interpretation requires either bringing different analytical tools to bear or collecting even more data for the sake of contextualization. (For example, with the help of several more students [Steve Hanabergh ’21, Will Small ’21, and Connor O’Neill ’22] I’ve started collecting data about soldiers from the 5th New Hampshire who died from illness or combat to see if they differed in any noticeable way as a group from the men who survived.) As I wrestle with the data in the next series of posts, you’ll see me thinking “aloud” and thrashing about in one direction or another as I try to find the message in the noise.
When I completed the spreadsheet with the data for my 403 men, I exclaimed to my wife, “I’m finished!” But this is not an end; it is really a beginning. Over the coming weeks and months, I will play with the data on this blog and see what they reveal. So why not follow me on the journey?