Category Archives: legal information

Improving Access to Notes and Comments in Digital Collections

Original footnotes and hyperlinks added to version published in Vol. 38, Iss. 2 of the ALL-SIS Newsletter (Winter 2019).

Aaron Kirschenfeld, Digital Initiatives Law Librarian at the UNC School of Law’s Kathrine R. Everett Law Library

Law journal case notes, statutory notes, and comments (collectively called notes and comments) can be an excellent resource, especially when starting research in a new or unfamiliar area. Notes and comments are rich with footnotes that have been cited carefully. They are descriptive or explanatory in nature and mercifully short. In an era when many law journals are available in online digital collections, I have come to think of notes and comments as a vast, disorganized set of encyclopedia entries that have been hiding in plain sight. This article tells the story of how our library went about revealing some of them.

When we launched our BePress Digital Commons institutional repository in early 2016, we sought to add unique materials that our patrons, both at the law school and across the state, would value. Following the lead of other law libraries, the first collections we added were backfiles of law journals affiliated with the law school that we purchased from Hein. As the Digital Initiatives Law Librarian, I managed the process for getting the documents and descriptive metadata into the repository.

Although I am a member of our Collection and Technology Services department, I am also a reference librarian, with regular desk shifts, teaching duties, and the like. After our North Carolina Law Review collection was published to the web, I fielded a reference question for an article that appeared to be missing from the repository. I discovered, to my deep embarrassment, that while the article – a student note – was in fact on the repository, it was not easy to find.

I worked to figure out the scope of the problem and its cause. For some reason, individual notes or comments were not indexed at the item level in the data we received from Hein. Instead, they were chunked together in large files, sometimes up to 200 or more pages long, that were titled “Notes and Comments.”1 Researchers with a citation in hand could of course navigate to the proper “Notes and Comments” document and scroll to the appropriate page, but they were unable to find it otherwise.2

The individual notes or comments had titles that essentially amounted to rich subject headings. For instance, “Real Property — Easements — Prescriptive Acquisition in North Carolina,” a note written by John G. Aldridge and published in 1966 at 45 N.C. L. Rev. 284, does a great job of describing itself. The Supreme Court of North Carolina subsequently cited it in 1974, but there was no way to search for it by title, subject, or author either in our repository or on HeinOnline. As a result, the data was not indexed in search engines, either. The notes were available online, but they were largely inaccessible.

We decided that while this was not a particularly urgent problem, we wanted to do what we could to help people find these notes. For one, the notes seemed to be about topics still relevant not only to scholars with an interest in the law’s development, but also, in some cases, to practitioners looking for summaries of law that had not changed much over time. So we set about our work to create metadata and individual digital objects for each note or comment.

Over the course of six months, I supervised a reference librarian, Allison Symulevich, as she accurately hand-keyed much of the descriptive metadata for the notes and comments. We had some false starts prior to Allison joining the project, and the final product benefited greatly from her knowledge of legal sources and attention to detail. I also was able to employ a student worker, Christopher Bishop, to carefully split the large PDF files into individual files.3

By May 2018, we transformed 188 “Notes and Comments” files from the North Carolina Law Review into more than 1,700 individual items, each with its own descriptive metadata and PDF file.4 The notes and comments stretched from Volume 5 to Volume 62, or from 1926 to 1983, covering 57 years of student contributions.

We have been able to learn a bit about the notes and comments since we completed the project. As one would expect, many of the authors of student notes went on to play important roles in the profession. A sample of twenty-five newly described documents turned up notes by a future U.S. Court of Appeals judge, a future North Carolina Court of Appeals judge and state legislator, and several prominent attorneys.

Likewise, based on download counts, access to the notes themselves has increased. Since September 2017, the newly described notes have been downloaded more than 18,000 times. In the same period, the old chunked-together notes and comments documents were downloaded only 748 times. It turns out that good metadata really does increase accessibility, at least in this case.

Finally, I was able to work with Hein to transfer our work into the HeinOnline Law Journal Library.5 That work was just completed in late October, and all subscribers will now be able to access the newly indexed notes. The company has also identified a little more than 1,000 additional “Notes and Comments” sections in other law journals, and they are in the process of putting together a production schedule for work to describe and separate those items.


1 I initially suspected that this was caused by the notes never being indexed in Index to Legal Periodicals, but this is incorrect. Notes were indeed indexed. Representatives from Hein concluded that the omissions occurred due to “an indexing decision made many years ago [during the late 1990s] which may have resulted from a lack of table of contents provided in the print edition that was digitized.” For more information on the history of legal periodical indexing, see Richard Leiter, A History of Legal Periodical Indexing, 7 Legal Ref. Servs. Q. 35 (1987). For more on the ILP’s indexing policy, see, e.g., Miles O. Price & Harry Bitner, Effective Legal Research 279 (3d ed., 1969).

2 For an example of a “chunked together” document still on the repository, see, e.g., Notes and Comments, https://scholarship.law.unc.edu/nclr/vol45/iss2/4/ [https://perma.cc/847S-W7BE].

3 For more technical detail, there is an archived video recording of a talk I gave on this project at CALICon18 in June 2018, Notes & Comments: Unique Resources for the Law School Institutional Repository, https://youtu.be/Tk4V0yRt3dA [https://perma.cc/4V28-EBPU].

4 See North Carolina Law Review, https://scholarship.law.unc.edu/nclr/all_issues.html [https://perma.cc/VR34-JCSD].

5 This collection contains more than 2,600 law journals, many with article level metadata since inception. See Law Journal Library, https://home.heinonline.org/content/law-journal-library/ [https://perma.cc/6G6W-7JLP]. For more information on the origins of HeinOnline, see Joe Gerken, The Invention of HeinOnline, 18 AALL Spectrum, Feb. 2014, at 17.

Crushed by bepress

At the outset, let me state explicitly that these are my opinions, insofar as one can develop opinions of one’s own. There are a lot of assumptions buried in what follows, many of which I am (painfully) aware of. Also, these are initial impressions, but I stand by them as what they are. [Some edits made for clarity.]

The Press

Elsevier acquired bepress, the company that offers the Digital Commons repository platform, which my institution purchased in 2015 shortly after I was hired, and to some degree at my urging. I am an academic law librarian at a large public institution. I have a little more than two years of experience in the field and in this job. Many academic law libraries have Digital Commons repositories.

At the time we signed on to Digital Commons, we had no mechanism for posting well-described full-text documents to the web. At first, we focused on uploading articles already published in student-run law reviews at our institution and then planned to move on to our faculty members’ scholarship (both published and drafts / working papers / pre-prints). Later, we would add unique collections of primary law documents. In the course of 18 months, I managed the repository and uploaded 6,000+ documents, which have been downloaded 160,000+ times.

Digital Commons offered a mechanism for uploading many documents at once and promised an easy-enough way of mediating document intake. The wider institution was working on a large project with a homegrown repository system that would meet the needs of all university faculty wanting to comply with the (voluntary) OA policy. Our library does not work much with the campus libraries — it is, like many law libraries, an independent entity on campus. Law libraries serve legal scholars, law students, and members of the public. Law publishing patterns, including those for scholarship, are considerably different than those in other areas of study or work, as are the needs of our discipline’s scholars.

I have about eight or ten different job duties, one of which is the maintenance of the repository. Overall, I’d say it makes up about 25% of my job, which also includes teaching advanced legal research, providing reference services to library patrons, working with the OPAC, etc., etc. I have some competence in computing: I know basic Python scripting, which I have taught myself since starting this job. I know my way around the command line, too. I am a legal subject-matter specialist (and, not that it matters, a lawyer). And I am terrified.

First, it’s my responsibility to make sure as many people as possible have access to work produced by our faculty and student journals at the lowest cost possible. At the time I advocated we use bepress, I figured that the fee we pay for Digital Commons plus my part-time labor were acceptable trade-offs, in cost, for being able to make a lot of information free for the public to access. Maybe I was right, maybe I was wrong, but the fact is we got a lot of documents up quickly and people accessed them. Digital Commons was a fine solution — not perfect, but functional — but it was, and is, a for-profit publisher. With the Elsevier acquisition, I don’t know if it is as good a solution any more, for any number of reasons: increased cost, negative perception by the faculty, deprecation of the product….

But what are our options otherwise? I see two: (1) my law library joining in with the campus project, where, because of our difference as a law library, we will not likely get what we want (ease of use, control, etc.) and we will compete for attention with other, flashier scholarly areas, or (2) joining some kind of pan-law-library consortial effort to build “our” own pre-print and repository management system. For the first, let me say that I admire the hard work and progress that has been made on campus; the university-wide repository is an amazing work of collaboration and engineering. For the second, let me say that I admire the consortial work that has been done to preserve legal documents and share resources, and yes, the new partnership with OSI known as LawArXiv, which is still very much in early days.

The question I have is this: will it really be possible to muster the will, buy-in, and know-how to make either (or both) of these non-commercial options worthwhile for my library? I imagine that both will be costly in terms of money and time, and that neither will be as easy or as customizable to my needs as Digital Commons is. But neither will have the stink of Elsevier that is anathema to so much of the academic and legal academic community. I understand that my hard work (hours and hours of creating descriptions and uploading documents) will be co-opted and sold back to me by some multinational corporation. That’s life; things are hard, the rich get richer, and we struggle to stay afloat. But I care about my patrons, the public, having access to information in a way that works for them, not for me or for my ideology.

So while the gotchas and the schadenfreude and the quick answers are fun, the truth is that we have a complex problem on our hands, and that there’s lots of work to be done to meet the public’s need not only for legal scholarship but also for other types of legal information. It’s not fun, it’s just hard.

File Naming Conventions for Court Documents

Introduction

Summer is coming, and I’ve embarked on a research project concerning documents from the U.S. District Courts. As I was consolidating my work this morning, I grew frustrated with the naming conventions (or lack thereof) I’d been using while saving PDFs to my drive. And since I have been rather, er, interested in filenames recently, it seemed as good a time as any to collect my thoughts. I will begin with the caveat that while I am a librarian and professor of legal research, and a successful applicant to the N.C. bar, I have never worked on a case as an attorney, so I am hoping to solicit some feedback in the comments below. I plan to keep this post updated with new information as it becomes available. So, let’s begin!

Federal Courts

U.S. District Courts & Bankruptcy Courts

An Example

The first federal district court document I found was a complaint from the Middle District of North Carolina, filed in 2016 and assigned the docket number 1:16-cv-00988. Here are two examples of how I would name this file:

NCMD_1-16-CV-00988_0001.pdf
or
NCMD_1-16-CV-00988_0001_Complaint.pdf

Explanation

Because the 94 federal district courts all use the same docket numbering pattern, it is essential to start the file name with an abbreviation for the court. I chose the abbreviation that the U.S. Courts uses for its websites (e.g., http://www.ncmd.uscourts.gov/), which I like because it should always make it easy to sort by state alphabetically. Then comes an underscore, followed by the docket number, with its parts separated by single dashes at the appropriate points. (Incidentally, this piece of the citation is considered “valid” by Bloomberg Law’s docket search feature, which is an added bonus.) Finally, after another underscore, comes the document number from the docket, which should always be four characters, prepended with as many leading zeros as are necessary to achieve this. If it’s important for you to be able to tell, at a glance, what the document is (an order, a memorandum in support of a motion, etc.), you should develop some kind of vocabulary for naming the document and add that after the document number.
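For anyone who would rather not type these names by hand, here is a minimal sketch in Python of the pattern just described. The function name district_filename and its parameters are my own inventions for illustration, and the sketch assumes the docket number arrives in the usual 1:16-cv-00988 form.

```python
import re


def district_filename(court, docket, doc_number, label=None):
    """Build a name such as NCMD_1-16-CV-00988_0001_Complaint.pdf.

    court      -- court abbreviation taken from the court's web address
    docket     -- docket number as assigned, e.g. "1:16-cv-00988"
    doc_number -- integer for the trailing four-digit number
    label      -- optional descriptive word or phrase
    """
    # Swap the colon for a dash and uppercase the case type (cv -> CV).
    docket_part = docket.replace(":", "-").upper()
    # Pad the trailing number with leading zeros to four characters.
    parts = [court.upper(), docket_part, f"{doc_number:04d}"]
    if label:
        # Keep underscores reserved as field separators.
        parts.append(re.sub(r"[\s_]+", "-", label.strip()))
    return "_".join(parts) + ".pdf"


print(district_filename("ncmd", "1:16-cv-00988", 1))
print(district_filename("ncmd", "1:16-cv-00988", 1, "Complaint"))
```

The two calls print the two example names given earlier. Reserving the underscore as the only field separator is a design choice on my part; it is what makes the names easy to split apart again later.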

This naming scheme will allow you to easily parse your filenames and get them into a structured data format like a .csv file or Excel spreadsheet, if you’re going to be doing any analysis; greater structure will also help if you use a citation management system while writing. Of course, you can simply sort your directories by filename alphabetically and likely get a good picture of what you’ve got.
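As a rough illustration of that parsing step, the sketch below pulls the pieces out of district court filenames with a regular expression and writes them out as CSV text. The column names (court, office, year, case_type, sequence, document, label) are labels I made up for this example, not an official vocabulary, and the pattern assumes the names follow the scheme above exactly.

```python
import csv
import io
import re

# Assumed layout: COURT_office-yy-TYPE-sequence_NNNN[_Label].pdf
PATTERN = re.compile(
    r"^(?P<court>[A-Z]+)_"
    r"(?P<office>\d+)-(?P<year>\d{2})-(?P<case_type>[A-Z]+)-(?P<sequence>\d+)_"
    r"(?P<document>\d{4})"
    r"(?:_(?P<label>[^_.]+))?\.pdf$"
)

FIELDS = ["court", "office", "year", "case_type", "sequence", "document", "label"]


def parse_district_filename(name):
    """Return a dict of the filename's components, or None if it does not fit."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


def filenames_to_csv(names):
    """Return CSV text with one row per filename that matches the scheme."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for name in names:
        row = parse_district_filename(name)
        if row is not None:  # silently skip files named some other way
            writer.writerow(row)
    return buffer.getvalue()


print(filenames_to_csv([
    "NCMD_1-16-CV-00988_0001_Complaint.pdf",
    "NCMD_1-16-CV-00988_0001.pdf",
    "scan-from-tuesday.pdf",
]))
```

Files that do not match are simply skipped here; depending on your project, you might prefer to collect them in a separate list so you can rename them.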

Creating Directories

For cases in the federal district courts, I’d also recommend creating directories for each case or matter that is being studied, but only if that case is particularly significant to your research or you have collected a lot of documents related to the case. I’d recommend using party names and eschewing the lowercase directory naming convention. For the document I’ve been using as an example, the directory would be Calloway-v-Moore, and you would also place documents from subsequent appeals or docketings there. A case directory would presumably also be useful if the case had been removed from state to federal court or if it had been transferred.
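If you have many files to sort, a short script can create the case directory and move documents into it. This is only a sketch: case_directory and file_into_case are hypothetical helper names, and the demonstration runs in a temporary folder with a placeholder file so that nothing on your own drive is touched.

```python
import shutil
import tempfile
from pathlib import Path


def case_directory(root, first_party, second_party):
    """Create (if needed) and return a directory named like Calloway-v-Moore."""
    directory = Path(root) / f"{first_party}-v-{second_party}"
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def file_into_case(pdf_path, directory):
    """Move one document into the case directory, keeping its filename."""
    pdf_path = Path(pdf_path)
    destination = Path(directory) / pdf_path.name
    shutil.move(str(pdf_path), str(destination))
    return destination


# Demonstration in a throwaway folder with a placeholder file.
with tempfile.TemporaryDirectory() as root:
    pdf = Path(root) / "NCMD_1-16-CV-00988_0001_Complaint.pdf"
    pdf.write_bytes(b"placeholder, not a real PDF")
    moved = file_into_case(pdf, case_directory(root, "Calloway", "Moore"))
    print(moved.parent.name, moved.name)
```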

Bankruptcy Courts

My solution for documents from Article I U.S. Bankruptcy Courts follows the same structure as above, using the URL abbreviation from the federal court system website. A case from the United States Bankruptcy Court for the Eastern District of North Carolina would begin with NCEB. I will try to compile a simple chart of all the abbreviations used in federal court URLs.

U.S. Court of Appeals

An Example

The U.S. Courts of Appeals comprise thirteen circuit courts. Eleven are numbered geographic circuits, which leaves the DC Circuit and the Federal Circuit. I find it considerably easier to name files from these courts; however, there is a wrinkle concerning the case citation. Here are two examples of opinions I have collected:

04CA_1956_239-F2d-502.pdf
and
05CA_2016_15-50559_Crose-v-Humana.pdf

Explanation

The first component is rather self-explanatory — a two digit designation for the numerical circuit followed by CA. For the DC Circuit, I recommend DCCA and for the Federal Circuit, FCCA because, well, I am allowed to be arbitrary and there doesn’t seem to be a likelihood that either abbreviation will conflict with other abbreviations used in the legal field. Also, each is four characters — consistency! [Though Rachel Gurvich, a colleague and legal writing professor at UNC, pointed out that she uses CTAF, which is the four-character abbreviation used by Westlaw for Federal Circuit cases. Just pick one and stick with it!]

Next, we have a four-digit year notation for the date the opinion was filed. Following that is the case number or case citation, which will depend on what is available and perhaps on your own intellectual bent. Opinions from the U.S. Court of Appeals are published in West Publishing’s Federal Reporter, but, more recently, have been published directly by the court on its website or collocated and served from a site like govinfo.gov. If you have a document from the Federal Reporter (now in its third series, so F, F2d, and F3d will be appropriate abbreviations), I recommend following the Volume-Reporter-FirstPage model of citation. If you have any other federal appellate opinion, I recommend using the case number followed by an underscore and the case name format discussed above. Of course, appending the case name might be helpful in the case of an opinion in the Federal Reporter.

Again, the idea here is not to make your filename look like a Bluebook citation; it is to make it easier for you to draw meaningful information from the filename alone and to facilitate parsing of the data you’ve collected for any number of important purposes.
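The appellate pattern can be sketched in Python along the same lines. The names circuit_code and appellate_filename are hypothetical, and I am assuming that a reporter citation is typed with spaces and periods (239 F.2d 502) while a case number is typed as the court gives it (15-50559).

```python
def _slug(text):
    """Drop periods and join words with dashes: '239 F.2d 502' -> '239-F2d-502'."""
    return "-".join(text.replace(".", "").split())


def circuit_code(circuit):
    """Return 01CA through 11CA for numbered circuits, DCCA or FCCA otherwise."""
    if isinstance(circuit, int):
        if not 1 <= circuit <= 11:
            raise ValueError("numbered circuits run from 1 to 11")
        return f"{circuit:02d}CA"
    # Four-character codes for the two non-numbered circuits, as chosen above.
    return {"DC": "DCCA", "FEDERAL": "FCCA"}[circuit.upper()]


def appellate_filename(circuit, year, identifier, case_name=None):
    """Join court code, year, citation or case number, and optional case name."""
    parts = [circuit_code(circuit), str(year), _slug(identifier)]
    if case_name:
        parts.append(_slug(case_name))
    return "_".join(parts) + ".pdf"


print(appellate_filename(4, 1956, "239 F.2d 502"))
print(appellate_filename(5, 2016, "15-50559", "Crose v. Humana"))
```

These two calls reproduce the two example names shown earlier. If you prefer a different code for the Federal Circuit, such as CTAF, only the small lookup table needs to change.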

U.S. Supreme Court

Naming a case from the U.S. Supreme Court is rather straightforward: US followed by the four-digit year of decision, the case number, and some kind of descriptive phrase about the case. These two names, representing recent cases in my collection, serve as examples: US_2014_13-354_Burwell-v-HobbyLobby.pdf and US_2016_15-108_PR-Sovereignty.pdf

State Courts

North Carolina

There is more on this to come in a subsequent post, but in the meantime, check out some data that was pretty easily parsed out from the information provided by the N.C. Administrative Office of the Courts concerning N.C. Court of Appeals opinions.

I will only discuss naming files from state appellate courts here, and will use North Carolina as an example. For state trial court documents, I’m afraid we will need another extremely long and boring blog post.

Intermediate Appellate Courts

Here is an example of a filename for a document from a state appellate court:

NC_CA_2017_16-899_AttyFeeDispute.pdf

This case, from the North Carolina Court of Appeals, was interesting to me because it involved a quantum meruit claim for an attorney’s services. It’s the only 2017 case in my collection about an attorney fee dispute, and the party names weren’t significant to me, so I used a descriptive phrase. Pretty straightforward. If your state has multiple intermediate appellate courts, I recommend coming up with a two-character abbreviation. Please leave it in the comments below! For those of you regularly practicing in the nine states lacking intermediate appellate courts, I have just wasted a minute of your time.

Courts of Final Resort

I recommend using SC as the abbreviation for North Carolina’s Supreme Court, rendering a file name as such: NC_SC_1980_299-NC-360_Ragland-v-Moore.pdf

Here, we have an opinion from 1980, but because of the nature of the publication of state court opinions, we have a complication. Most states both have an official reporter for decisions and have decisions published in West’s National Reporter System. I suggest that, when possible, you use either the case number or the official reporter citation in your filename. The reason is that both are richer in data at a glance than the regional reporter citation. There’s a lot to say about medium-neutral citation here, but my purpose isn’t to give a lesson in that whole mess, but rather to suggest an efficient and useful method of naming files so you can find them easily and analyze what you have.

A Note About Case Management Systems

Professor Gurvich also suggested, rightly, that a given firm or organization’s case management system will add information, likely a client ID and/or a matter number, to most documents. One thing that is great about case management systems is that they obscure the small details of document metadata and make things easier or more intuitive to find. In many ways, this is the point of software and information design: to make the user’s experience more fluid (and to prevent inconsistencies in data entry) so she need not worry about, say, following a complex system of file naming perfectly.

The file-naming system described above is suggested for those without access to such a system, or those, like scholars, who use court documents in a way different from lawyers or judges. I’d suggest having a directory or file for each client, and a sub-directory for each matter, and then placing named files inside, to roughly recreate the strength of a good automated document management system.

That said, I still think there is plenty of value in organizing one’s files according to a consistent scheme. This one is built with the idea of being both human and machine readable — that is, providing information to a user both at a glance and after some kind of software manipulation. Likewise, file names are indexed by all modern operating systems, meaning that using your computer’s search function will be faster, especially if you append descriptive phrases, such as party or document names, at the end of your file name.
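The same idea works from a script as well as from your computer’s search box. Below is a small sketch that looks through a folder (and its sub-folders) for files whose names begin with a given court abbreviation and, optionally, contain a descriptive phrase. The function find_documents and its parameters are my own illustration, and the demonstration uses empty placeholder files in a temporary folder.

```python
import tempfile
from pathlib import Path


def find_documents(root, court=None, phrase=None):
    """Return sorted PDF paths under root, filtered by court prefix and phrase."""
    pattern = f"{court}_*.pdf" if court else "*.pdf"
    hits = sorted(Path(root).rglob(pattern))
    if phrase:
        # Compare in lowercase so "humana" finds "Crose-v-Humana".
        hits = [path for path in hits if phrase.lower() in path.stem.lower()]
    return hits


# Demonstration with placeholder files named after examples from this post.
with tempfile.TemporaryDirectory() as root:
    for name in [
        "NCMD_1-16-CV-00988_0001_Complaint.pdf",
        "05CA_2016_15-50559_Crose-v-Humana.pdf",
        "NC_CA_2017_16-899_AttyFeeDispute.pdf",
    ]:
        (Path(root) / name).touch()
    for path in find_documents(root, court="NC_CA"):
        print(path.name)
    for path in find_documents(root, phrase="humana"):
        print(path.name)
```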

And here, for your viewing pleasure, is a list of files of case documents:

Case Naming Conventions