New Case where Police Use Hash to Catch a Perp and My Favored Truncated Hash Labeling System to ID the Evidence

August 17, 2008

Part of my discipline as an e-discovery specialist is to try to read (or at least skim) every published opinion on the subject. Lots of attorneys specializing in this area do that. But there is one other type of case I also read: every opinion that uses the word “hash.” No, I do not need help from Narcotics or Overeaters Anonymous. The kind of hash I am addicted to is purely algorithmic. This hash comes in many flavors, but the best known, and the ones usually employed in e-discovery, are called MD5 hash, SHA-1 hash, or the latest and greatest, SHA-2 hash.

As I explain in my blog Hash page, hash is the mathematical foundation of e-discovery and the most powerful tool of any forensic investigator. It reveals the unique mathematical fingerprint of every computer file that allows for perfect identification and authentication of electronic evidence.  I became fascinated with the powers of hash a few years ago, and ended up writing a lengthy law review article on the subject. HASH: The New Bates Stamp, 12 Journal of Technology Law & Policy 1 (June 2007). A few months ago I wrote a blog on the article called The Days of the Bates Stamp Are Numbered, talking about some of the more recent developments in this area of the law, especially the shift from Tiffing and linear flat file Bates stamping to native file hash marking.
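To make the "fingerprint" idea concrete, here is a minimal sketch in Python of how the three hash flavors mentioned above can be computed for a file. The function names `stream_hashes` and `file_hashes` are my own illustrations, not part of any e-discovery tool, and SHA-256 stands in here for the SHA-2 family.

```python
import hashlib
import io


def stream_hashes(stream, chunk_size=1 << 20):
    """Return uppercase hex MD5, SHA-1 and SHA-256 digests of a binary stream."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    # Read in chunks so that very large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for digest in digests.values():
            digest.update(chunk)
    return {name: d.hexdigest().upper() for name, d in digests.items()}


def file_hashes(path):
    """Hash a file on disk (hypothetical helper; the path is supplied by the caller)."""
    with open(path, "rb") as fh:
        return stream_hashes(fh)


# Demonstration on three bytes held in memory rather than on a real file.
demo = stream_hashes(io.BytesIO(b"abc"))
for name, value in demo.items():
    print(name, len(value), value)
```

The MD5 value comes out at 32 hexadecimal characters, the SHA-1 value at 40, and the SHA-256 value at 64. Change a single byte of the input and all three values change.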

In the process of researching the original law review article, I am pretty sure I read every legal opinion and legal article ever written that mentions hash. I also read a few scientific and cryptological articles, most of which I did not really understand. Having put that much time and effort into the subject, I try to keep up by reading every new legal opinion or article mentioning hash. That is why I have a standing search for all cases using the term, and automatically receive a copy of them by email as soon as they are published. I can be in the middle of dinner and my BlackBerry will buzz, alerting me to a new hash case. Lest you think that’s a tad weird, I am willing to bet that there are a few other hash enthusiasts out there, Craig Ball comes to mind, who do the same thing. (See Craig Ball’s excellent article “In Praise of Hash” at pg. 52.)

Hash and Child Pornography

Most of the new hash cases I see have nothing to do with e-discovery per se. Instead, they are usually criminal law cases, typically cases involving one of the most disgusting of crimes, child pornography. Police have been using hash to catch perps in this area for years. Hash is an effective tool for this because it allows police to know if certain child pornography is located on a computer, usually videos or still photos, by looking to see if the hash values for these files are present. That is a bit of an over-simplification, but suffice it to say that there are lists of hash values that are known to be associated with computer files which are unquestionably child pornography. New York Attorney General Andrew Cuomo explained the process in a press release in June 2008 announcing a deal with major Internet providers to block major sources of child pornography: 

As part of the undercover investigation, the Attorney General’s office developed a new system for identifying online content that contains child pornography.  Every online picture has a unique “Hash Value” that, once identified and collected, can be used to digitally match the same image anywhere else it is distributed.  By building a library of the Hash Values for images identified as being child pornography, the Attorney General’s investigators were able to filter through tens of thousands of online files at a time, speedily identifying which Internet Service Providers were providing access to child pornography images.
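The matching step described in the press release can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `match_known` and `known_hashes` are hypothetical, the sample data is harmless placeholder text, and nothing here reflects the actual system built by the Attorney General's office.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Uppercase hexadecimal SHA-1 of a block of bytes."""
    return hashlib.sha1(data).hexdigest().upper()


def match_known(files: dict, known_hashes) -> list:
    """Return the names of files whose SHA-1 appears in a list of known hash values.

    `files` maps a file name to its contents; `known_hashes` is any iterable
    of hex strings. Case is normalized so that 'a99...' and 'A99...' compare equal.
    """
    known = {h.upper() for h in known_hashes}
    return [name for name, data in files.items() if sha1_hex(data) in known]


# Placeholder data: the "known" list holds the SHA-1 of the three bytes b"abc".
known_hashes = ["a9993e364706816aba3e25717850c26c9cd0d89d"]
files = {"first.bin": b"abc", "second.bin": b"abcd"}
print(match_known(files, known_hashes))  # ['first.bin']
```

Note the limit of the technique, which is part of why the description above is an over-simplification: the lookup only finds exact copies. A file that differs by even one byte has a different hash value and will not match.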

U.S. v. Warren

I recently received a new hash case alert from a district court in Missouri. U.S. v. Warren, 2008 WL 3010156 (E.D.Mo. July 24, 2008). A quick review showed it was yet another child porn case, so I did not think much about it. I just added it to my reading list for more careful study later, just in case there might be something special about it. When I got around to reading Warren yesterday, I was very pleasantly surprised, as this was indeed a special case.

Warren is a case considering and rejecting a motion to suppress evidence, namely computer video files of underage teens having sex. The motion to suppress was based on a series of hyper-technical challenges to the affidavit which the St. Louis police submitted to the judge to receive a search warrant of defendant’s computer. The affidavit explained how the police had searched the Internet for files “whose digital SHA-1 value was identical to that of a file known to contain child pornography.” They found a computer with an Internet Protocol address of 70 … 167 offering to share one such known file, and then subpoenaed AT&T to get the physical address of the subscriber with that IP address. The computer was located in Affton, Missouri.

The police detective’s affidavit explained how the hash values and offer to upload established “that a computer in Missouri was ‘offering to participate in the distribution of known child pornography.’” Based on this affidavit, the judge found probable cause to issue the search warrant of the computers located in Warren’s home. The police then went to his home, found no one there, forced entry, and seized his computer. Warren himself later came along, and, foolishly enough, voluntarily came to the police station, waived his right to counsel several times, and spoke at length to the police. The opinion includes extensive excerpts of the taped interview, which Warren later argued was made in violation of his right to legal counsel.

The defendant’s technical search warrant objections forced the court to delve into many of the characteristics and evidentiary properties of hash. For that reason alone, the case is useful to any practitioner trying to better understand the subject. But what is really special about the case, at least for me, is the system of hash file identification used by the court to identify the offending video tape at issue in this case. That video computer file was the key piece of evidence, the “smoking gun.”

Six-Place Hash Truncation Naming Protocol

The opinion by Magistrate Judge David D. Noce in Warren is unusual and special because it is the first case to use the truncated hash value labeling system I proposed in HASH: The New Bates Stamp. My article was not mentioned, and apparently Judge Noce was not aware of it. He used the six-place hash truncation system I proposed in my article because it was, in his words, “convenient” to do so, and because the detectives had used that system in their affidavits and testimony. I doubt the police detectives had read my law review article either, which makes their use of the abbreviation system all the more important. It shows that it is a natural and reasonable thing to do, although this is the first time it has been utilized or mentioned in a legal opinion.

So what is the six-place hash truncation system which I proposed that these Missouri officials are now in fact using? Before I can answer that, I have to go into a little more depth about hash and Bates stamps. HASH: The New Bates Stamp not only explains hash and its importance to e-discovery, it also argues for the legal profession and e-discovery industry to adopt a new type of electronic document naming protocol that uses hash values, instead of sequential numbering, to identify electronic evidence. I argue that the time has come for the legal profession to abandon Nineteenth Century Bates stamp paper mentality, and adopt Twenty-First Century ESI hash mentality. I proposed that sequential Bates stamps be replaced by non-linear, intrinsic hash values.

The hash values would not only identify ESI, they would authenticate it too, something the lowly Bates stamp could never do.  But the problem with using hash values to identify ESI, instead of Bates stamps, is that hash values are too long and awkward for the human mind. Here is what a typical forty place hexadecimal SHA-1 hash value looks like: 2B37BC6257556E954F90755DDE5DB8CDA8D76619.

Police detectives, lawyers and judges cannot go around describing computer files used as evidence with such long alphanumerics. It is too cumbersome a name to replace the Bates stamp. So my common sense proposal, which Judge Noce in Warren calls “convenient,” is to only use the first and last three places of the hash value, instead of all forty. So the hash value above becomes the much more manageable 2B3 … 619. That truncated hash value becomes a pretty good document name, and, in my opinion and that of many others, should replace the arbitrary Bates stamp.
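The truncation itself is trivial to automate. Here is a minimal Python sketch, with `truncate_hash` being my own hypothetical name for the helper:

```python
def truncate_hash(hash_value: str, places: int = 3, sep: str = " … ") -> str:
    """Keep only the first and last `places` characters of a hash value."""
    value = hash_value.strip()
    if len(value) < 2 * places:
        raise ValueError("hash value is too short to truncate")
    return f"{value[:places]}{sep}{value[-places:]}"


full = "2B37BC6257556E954F90755DDE5DB8CDA8D76619"
print(truncate_hash(full))           # 2B3 … 619
print(truncate_hash(full, sep="-"))  # 2B3-619
```

The separator is a parameter because an ellipsis is awkward in file names; a hyphen may be the more practical choice there.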

Turns out that the detectives in Missouri were already following this six-place truncation protocol at the time my article was published in June 2007. Perhaps they and other law enforcement agencies have been using this system for years. I do not know for sure, although I doubt it has been a widespread practice. I have talked to many e-discovery forensic experts about the hash naming proposal over the past two years. Many of these experts did police work before going into e-discovery, and none ever mentioned having done this before. Also, it certainly does not appear in the legal literature on the subject, that is, until U.S. v. Warren.

Hexadecimal Values v. Base32 Number System

At first, I was disappointed to see that Judge Noce’s introduction of the truncated hash value naming protocol was flawed with two obvious technical errors. See if you can catch them:

The search turned up a list of files, including one with a 32-character alpha-numeric SHA1 designation of “H4V … UTI.” Fn4

FN4 - For convenience, in this opinion the SHA1 value set out in full in the search warrant affidavit will be referred to as “H4V … UTI.” The affidavit defined the term “SHA1” (also known as “SHA-1”) as being a mathematical algorithm that uses the Secure Hash Algorithm (SHA), developed by the National Institute of Standards and Technology (NIST), along with the National Security Agency (NSA) . . . Basically the SHA1 is an algorithm for computing a condensed representation of a message or data file like a fingerprint.

Warren at *1.

First of all, the SHA-1 hash generates a 40-character hexadecimal string, not 32-character. The other kind of hash, MD5 hash, is the one that uses a 32 character string, not SHA-1. For this reason, my first reaction was that the Judge, or police, mixed up the two different types of hash, and meant to say 40 characters, not 32.

But then there seemed to be yet another, even bigger mistake. The letters H, V, U, T and I should not have been in the hash value name. The values generated in e-discovery work to represent SHA-1 and MD5 hash are always hexadecimal. That is a numerical system with a base of 16. This is typically represented by the numbers 0–9 for the first ten values, and A, B, C, D, E, and F to represent the last six, for a total of sixteen. In other words, a hexadecimal value does not employ any letters after F. Yet, the so-called SHA-1 alphanumeric stated in the Warren opinion uses the letters H, V, U, T and I: “H4V … UTI.”

I thought the police or Judge Noce must have messed things up, but I also seemed to remember reading somewhere that there were other ways to express hash values, and anyway, I am always very careful before I tell a judge that he or she is wrong. So doing a little online research, I learned that there are indeed other ways to display hash values using different binary based number systems, typically the 32 base or 64 base number systems. Base32 is defined in IETF RFC 3548 as using the characters A-Z and 2-7, while Base64 is defined in IETF PEM RFC 1421 as using the characters A-Z, a-z, 0-9, / and +.

My Online Investigation of Base32 Hash Math Led to a Shocking Discovery

Coming back to the Warren opinion, the hash values “H4V … UTI” are not hexadecimal, but they could be either Base32 or Base64. At this point, I did a little more online research about Base32 hash, and quickly found that there are many websites where you can locate music and videos to download based on their hash values. Almost right away, by simply using Google, I located a site where you can find media to download based upon their SHA1 Base32 value. It then took less than a minute to find the web page where the Base32 SHA-1 hash values were listed that began with “H4V.” That is how all of the media on the site was listed, in order based upon the first three characters of their Base32 hash values.

There were 83 entries on the webpage whose hash values began with H4V. The site included listings of music and videos ranging from Beethoven’s Symphony No. 9 to a video of Lee Trevino’s Golf Instruction. One video listing which was 11.1 MB in size had a disturbing title that suggested it could contain the kind of porn referenced in Warren. It was dated May 29, 2003. I clicked on its hash value button and saw that the full SHA-1 hash value for this video was H4VIBLSKAZ477WRTKH7IURE6NXEDCUTI.

When I saw that hash value, it shook me up. The first and last three values exactly matched the hash described in Warren: H4V … UTI. My academic investigation of the mathematical properties of hash had led me right to the smoking gun in Warren! I knew from my article, and the research of Bill Speros described in footnote 168, that this match of the first and last three values meant there was a 98.6% probability that this was the exact same file referenced in Warren.  Mr. Warren was charged with a felony for distributing this same video. I think it is a crime to even have it on your computer.

I do not know for sure if it is the same file, since the Warren opinion nowhere states the full hash value, but in view of the description of this video, it is just too much of a coincidence for it not to be.  It was astonishing on many levels to see just how quickly you can find a file like this on the Internet, simply by knowing the first three hash numbers. 

It is probably not possible to actually download or view the file from this website. I do not really know for sure, since that would involve clicking on this file, which I was not about to do. But when I clicked on the link for Beethoven’s Symphony No. 9, a piece of media which I do not find morally reprehensible, it took me to another web page. This page had links to other computers where you may in fact have been able to download Beethoven’s music. (I did not try, recognizing that might be a copyright violation.) At that point, the referring website included a statement that it “ONLY HAS INFO ABOUT FILES, AND DOES NOT OFFER ANY FILES FOR DOWNLOAD.” Still, if any law enforcement agency wants to contact me for the full website address, including Cuomo’s group, I would be happy to provide it. It is really very easy to find, and so I assume the proper authorities are already well aware of this site and its hash values, or lack thereof. I am certainly no police officer, and even if I were, I would not have the stomach for this kind of investigative work. Reading the email of parties in civil suits is about as horrid as I can handle.

Judge Noce Was Right

This little investigation proved to me that Judge Noce and the St. Louis police were correct. There is a SHA-1 hash that has 32 places, not 40, and it can use the whole alphabet, not just A-F.

The hash value H4V … UTI is indeed a correct first and last place truncation of a full SHA-1 hash value. But it is a SHA-1 hash that is expressed in Base32, not hexadecimal. Although the hash values used in e-discovery are almost always hexadecimal, the hash values used in “Peer-to-Peer” websites include a variety of different numerical systems, frequently including the Base32 system.
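The arithmetic behind the two lengths is simple. A SHA-1 digest is always 160 bits. Hexadecimal carries 4 bits per character (160 / 4 = 40 places), while Base32 carries 5 bits per character (160 / 5 = 32 places). The digest is the same; only the notation changes. The short Python sketch below shows the round trip, using the sample hexadecimal value from earlier in this post. The function names are my own, and the Base32 string it prints is simply that sample value re-encoded, not a value taken from the Warren opinion.

```python
import base64


def hex_to_base32(hex_digest: str) -> str:
    """Re-encode a hexadecimal digest in the Base32 alphabet (A-Z, 2-7)."""
    return base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")


def base32_to_hex(b32_digest: str) -> str:
    """Convert a Base32 digest back to uppercase hexadecimal."""
    return base64.b32decode(b32_digest).hex().upper()


sha1_hex = "2B37BC6257556E954F90755DDE5DB8CDA8D76619"
sha1_b32 = hex_to_base32(sha1_hex)

print(len(sha1_hex), len(sha1_b32))        # 40 32
print(sha1_b32)
print(base32_to_hex(sha1_b32) == sha1_hex)  # True
```

One practical consequence for the naming protocol: a truncated label taken from the hexadecimal form and one taken from the Base32 form of the same file will look nothing alike, so a label needs to say which notation it uses.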

In addition, in my brief investigation of the P2P webs, I learned that countless P2P type websites now commonly use the first three places of hash values as a convenient shorthand naming system. For all I know, the “perps” may also. As Judge Noce says, it is the convenient thing to do. So when will the e-discovery vendors start doing so too?




What Game Does an e-Discovery Team Play?

April 13, 2008

Hide the ball is certainly not the game for an e-Discovery Team to play. Some people think that is what discovery is all about, and in the world of paper discovery, years ago, there was some truth to that. But not today, and certainly not in electronic discovery. It may be tempting to some, but if you play hide the ball in e-discovery, and get caught, you may not only lose the case, but you may lose your job, and maybe even your license. It is never worth it, just ask Qualcomm’s lawyers. Instead, an e-Discovery Team plays a series of games that culminates in throwing the ball to the other side, not hiding it.

Before you can get to the final throwing step of production of electronically stored information (“ESI”), there are a series of preliminary games to be played. Here is how I summarize the e-Discovery team playbook:

  • Find the Ball
  • Save the Ball
  • Pick up the Ball
  • Shrink the Ball
  • Clean the Ball
  • Aim the Ball
  • Throw the Ball

The first game of find the ball is called the Identification step in the standard industry language of the Electronic Discovery Reference Model (“EDRM”). By looking at the standard EDRM model below you can quickly see how each game represents a basic step in the EDRM.

EDRM Standard Model for e-Discovery

Find the Ball

Finding the ball is far easier said than done. For most companies, the problem derives from storing terabytes of data. Imagine a string of warehouses storing a billion basketballs, and you have to search through and find the one ball among them autographed by Michael Jordan. Unless the Team is well established, you probably do not have an accurate, detailed, up-to-date map of all of the warehouses. You probably have only a vague idea where this one basketball might be located. It might be somewhere in a centralized bin, or in any one of dozens of other locations, including closets in employee homes, or off-site Internet storage lockers. It might even exist only in a shrunk down version, hiding in the pocket of one of a thousand employees; perhaps in their thumb-drive, or iPhone. Moreover, every day a thousand basketballs are destroyed (hopefully not the one with Jordan’s autograph), and twelve hundred new ones are added. Yes, it is a very challenging game indeed.

To make matters worse, you are never sure exactly what balls you are looking for, especially when the game first begins. You may have to guess, from a vague complaint, what balls are relevant. As I have written before, this is one of Anne Kershaw’s pet peeves, and rightfully so. Under federal notice pleading rules, very few details are required in a complaint to state a cause of action. So defense counsel is often left speculating what ESI will be discoverable and relevant in a new case. Still, you have to start making educated guesses to try to find the right batch of balls. From the large selection first identified, you will eventually throw a few to the other side.

The way most teams do this is to analyze the dispute to try to determine what the issues will be in the case. This gives you a general idea of the types of balls that may come into play. Then you start to determine a general time line; hopefully the potentially relevant balls will be constrained by time. You may be able to know, with some certainty, that balls made before or after a certain time are not relevant and need not be searched. An e-Discovery Team will also try to limit the search to balls made or stored by certain key players. These are the people in your company that are likely to be involved as witnesses in the lawsuit. The Team’s search should be focused on the storage bins of these key players.
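The two scoping filters just described, a time line and a list of key players, can be sketched as a simple filter over file metadata. Everything in this Python example is hypothetical: the `FileRecord` fields, the names and the dates are placeholders, and a real Team would pull this metadata from its own systems.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    path: str
    custodian: str   # the key player who made or stored the file
    modified: date   # last-modified date from the file system or mail server


def in_scope(records, custodians, start, end):
    """Keep records held by a key player and dated within the relevant period."""
    wanted = {c.lower() for c in custodians}
    return [
        r for r in records
        if r.custodian.lower() in wanted and start <= r.modified <= end
    ]


records = [
    FileRecord("mail/0001.msg", "Smith", date(2007, 3, 14)),
    FileRecord("mail/0002.msg", "Jones", date(2007, 5, 2)),
    FileRecord("docs/plan.doc", "Smith", date(2004, 1, 9)),
]
hits = in_scope(records, ["smith"], date(2006, 1, 1), date(2007, 12, 31))
print([r.path for r in hits])  # ['mail/0001.msg']
```

A filter like this only narrows where to look. It does not decide relevance, which still takes the educated guessing described above.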

Save the Ball

After playing find the ball, the next game is save the ball. Here the Team devises ways to preserve most of the balls identified as potential evidence in the last game. Again, this can be a very challenging game, especially when your company has many different auto-destruct routines in place (and most companies do).

If you think it is easy to stop all of these programs, just ask Intel. They thought they had stopped deletion of excess email for all the key players in the anti-trust case against AMD, but in fact the janitor programs remained in place for the most important players, including the top officers of the company. Their email was deleted for years after the case was filed. They tried to play a very complicated game of save the ball, but failed. For a better idea of just how difficult this game can be, check out Intel’s report to the supervising district court judge on their failed attempts to preserve evidence. This mistake has supposedly already cost Intel millions of dollars to correct by forcing them to go to their backup tapes to find the deleted emails, and the meter is still running. AMD is, of course, claiming that the error was intentional. They would like the court to enter sanctions for spoliation and turn this mistake into an outright win of the whole case.

So make no mistake about it, save the ball is one of the most important games an e-Discovery Team plays. As I have discussed before, that is why most e-Discovery Teams focus on this game as soon as the Team is formed, and look for ways to improve their company’s litigation hold procedures.

Pick Up the Ball

Again, this game sounds easy enough, you just collect the relevant ESI from the data you have identified and preserved. Seems easy, but it is not. There are tricks and traps here aplenty. If you are not careful, you could collect too much or too little. Generally you do not want to simply pick up all of the balls you have saved. That will make the next games too expensive. You want to screen out the ones that are unlikely to be needed, and probably are not relevant at all, but were preserved just in case. You want to preserve more broadly than collect because you never want to play save the ball twice in the same case. Not only is that kind of do-over expensive, but it may be futile because, in the meantime, routine processes may have deleted many balls not saved in the first pass.

You also do not want to pick up too few balls, and leave behind many that are directly relevant and should later be thrown to the other side. That kind of careless collection can also be expensive. It can force you into an expensive do-over, and open you to charges of hiding the ball. See, e.g., Court Disapproves Defendant’s “Hide the Ball” Discovery Gamesmanship.

Careless collection often occurs if the Team simply delegates this function to the key witnesses, and does not properly supervise or follow-up on their ball-picking efforts. The same comment holds true to the two prior games of ball identification and preservation. The Team cannot over-delegate its responsibility to key players and then just hope for the best. These are their games, and the Team must take responsibility to see they are played correctly. That is the whole purpose of an e-Discovery Team.

For that reason, in most cases it will not suffice to simply send out a preservation letter to the key players which describes the dispute, and then leaves it to them to find the relevant balls for themselves, save them, and pick them up. Without help and supervision from the Team, the key players may not know which of their computer files are relevant, they may not know how to properly preserve this ESI, nor how to collect it. They are sure to make mistakes. Thus, when the key players in a company are called upon to take part in the games, which in itself makes a lot of sense, since they should know their own information better than anyone else, they should be given expert help and advice from the e-Discovery Team. In other words, it is perfectly all right for the Team to delegate some of this work to the key players in the litigation, but the Team must still supervise and follow-up. Ultimately the Team should be responsible, since they are trained and more experienced in collection than the key players. The Team should have personal meetings with the key players and closely monitor their activities. In many cases, the Team should also implement certain safeguarding mechanisms to supplement the key players’ efforts, such as automated copying and keyword searches.

Another common mistake made in pick up the ball is to carelessly change the ball in the very process of picking it up. You could, for instance, change the metadata of a file, such as information as to when it was last viewed, saved, or revised. This is an especially high risk when the Team attempts to rely upon key players to pick up the ball for them. Although this probably will not matter in most cases, in some cases, such as stock backdating cases, this might be very important. As a general rule, the Team tries not to change the ball too much by the act of picking it up. The Team may later strip a file of all or part of its metadata on purpose, if that facilitates later cleaning or throwing, especially if the metadata is not important in the case, or not wanted, but they never want to do it accidentally.

A final common mistake, one of my pet peeves, is to neglect to hash the ball when you collect it, and properly preserve and tie the hash into each ball thereafter. I have described the process of using hash mathematics to authenticate ESI at length in my law review article, HASH: The New Bates Stamp, 12 Journal of Technology Law & Policy 1 (June 2007). I also provide an overview of the subject in this blog. The Team may already have hashed files as part of the preservation game; but if not, it is essential that they now be hashed at the collection stage. Hashing provides a unique identifying alphanumeric value for each computer file collected. This hash value can be later checked to prove that the file has not been altered since it was collected. This is a key step in ESI authentication to allow for admission into evidence at a hearing or trial. In most cases, hashing should be a normal part of ball pickup.
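Here is a minimal sketch, in Python, of what "hash the ball and tie the hash to it" can look like: record a hash value for each file at collection, then recompute and compare later. The function names and the sample file contents are my own placeholders, and the files are held in memory only to keep the example self-contained; a real collection would hash the files on disk with a forensic tool.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Uppercase hexadecimal SHA-1 of a block of bytes."""
    return hashlib.sha1(data).hexdigest().upper()


def build_manifest(collected: dict) -> dict:
    """Record one hash value per file name at the time of collection."""
    return {name: sha1_hex(data) for name, data in collected.items()}


def find_altered(manifest: dict, current: dict) -> list:
    """Return the names of files that are missing or no longer match the manifest."""
    altered = []
    for name, recorded in manifest.items():
        data = current.get(name)
        if data is None or sha1_hex(data) != recorded:
            altered.append(name)
    return altered


collected = {"memo.txt": b"Quarterly forecast", "email_001.msg": b"Re: pricing"}
manifest = build_manifest(collected)

# Later in the case, one file has been changed.
later = dict(collected)
later["memo.txt"] = b"Quarterly forecast (edited)"
print(find_altered(manifest, later))  # ['memo.txt']
```

The manifest is only as trustworthy as its own custody, so it should be preserved with the same care as the files it describes.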

Shrink the Ball

Shrink the ball is the game where the Team can save the company a lot of money. Thus, from a financial perspective, it is the most important game of all. In this culling step, you process the ESI to eliminate as much duplicate and irrelevant information as possible. Here good software and automated process are critical; so too is careful strategic thinking.

The goal is to significantly reduce the amount of ESI that must be reviewed and cleaned in the next steps. Thus, for instance, at the end of the last game you may have identified and preserved 1,000 gigabytes (1 terabyte) of ESI, and collected 500 gigabytes. To give you some idea of the amount of information we are speaking about, in some circumstances the 500 gigabytes may be equivalent to 500 truckloads of paper. It would cost a small fortune for teams of lawyers to read that much paper. We are talking about years of billable lawyer time to read that much data. It would also be a colossal waste of time because they would end up reading the same document dozens, if not hundreds of times. So it is critical to aggressively eliminate the redundant and immaterial ESI in this processing stage. In many cases the 500 gigs can be cut down to 100 or 50 gigs, resulting in tremendous savings in the expensive review games to come.
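Eliminating exact duplicates is one place where hash does the heavy lifting: two files with the same hash value need only be reviewed once. The Python sketch below shows the idea under my own assumptions. The `deduplicate` function and the sample files are hypothetical, and commercial processing software handles many refinements (email families, near-duplicates) that this ignores.

```python
import hashlib
from collections import defaultdict


def deduplicate(files: dict):
    """Group files by SHA-1 and keep the first file seen in each group.

    Returns (unique, duplicates): `unique` lists the files to review, and
    `duplicates` maps each kept file to the identical copies set aside.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha1(data).hexdigest()].append(name)
    unique = [names[0] for names in groups.values()]
    duplicates = {names[0]: names[1:] for names in groups.values() if len(names) > 1}
    return unique, duplicates


files = {
    "a.doc": b"same text",
    "copy of a.doc": b"same text",
    "b.doc": b"other text",
}
unique, duplicates = deduplicate(files)
print(unique)      # ['a.doc', 'b.doc']
print(duplicates)  # {'a.doc': ['copy of a.doc']}
```

Keeping the map of set-aside copies matters, because you may later need to show where every copy of a produced document was found.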

Clean the Ball

Here is where the big bucks come in, the cost to review the data for privileged, confidential, and irrelevant material. Still, most internal corporate e-Discovery Teams will not clean their own ball, they will hand it off to their caddy to do it for them, typically their outside legal counsel. A few of the more mature and well organized Teams have started to review their own data, and clean the ESI themselves. They have teams of contract attorneys they employ to do this work at reduced rates, some even send the data to lawyers in India for review. But for most Teams, this is advanced play that they do not have the time or skill to attempt.

This is a very important and risky step in the EDRM process and companies want to be sure it is done right. You review the truckloads of email and documents that have not already been culled out in the prior games so that you can remove the files that do not have to be produced. The last thing you want to do is produce privileged materials to your adversary. You need to clean your production of these secret files and produce a log of them instead. Even with a clawback agreement, an accidental disclosure can still result in waiver of your privilege to third parties. You also want to be sure the ESI review catches all confidential materials, and that they are produced with appropriate markings and confidentiality agreements. Trade secrets can be lost forever if they become a public record by filing with a court.

Aim the Ball

Now we come to the lawyerly game of aim the ball, where the ESI is analyzed to see how it fits into the case at hand. Here lawyers and paralegals tag each file to an issue, typically using review software. They also make final decisions as to whether and how information is responsive to discovery requests, or otherwise must be produced (or not). The files are categorized and rated for importance. Is this email a smoking gun that could kill your case, or is it merely of marginal relevance to a secondary issue? You had better find this out, and fast, as to each computer file you are about to disclose to the other side. If your analysis of the information to be produced shows you have a strong case, you will approach the case far differently than if your analysis shows you will surely lose when all of the cards are put on the table.

Obviously this analysis stage requires the sure hand and steady aim of trusted outside counsel assigned to defend or prosecute the case. Still, the legal members of the Team should assist and be involved in the analysis and evaluation of the merits of the case. This game concludes with final decisions by legal counsel on what should be produced and what should be withheld. These decisions must be rational and made in good faith.

If analysis shows you have a losing hand, you had better fold early before the other side realizes your position. You cannot do like Qualcomm and decide to withhold evidence just because you don’t like it. You can see where hiding the ball got them - they lost the patent they sued to enforce, they paid over eight million dollars in fees to the other side, their general counsel resigned in disgrace, and their outside counsel are now fighting to retain their licenses. When you are a plaintiff and find yourself in this position, you do not file the suit to begin with or, if you discover it in midstream, you should dismiss and cut your losses. The same applies when you are in a defense position. It is not an option to try and hide the evidence that will hurt your defense. You must instead try and make the most of it and settle as best as you can. That is how the American system of justice works and all Teams have to play by these same fundamental rules. Voluntary disclosure may not be the rule in the rest of the world, especially the civil law countries in Europe, but that is how the game is played here. If you are defending or prosecuting a case in the U.S., you are going to have to reveal your data to your adversary, even if that kills your case.

Throw the Ball

The last game is the culmination of all the rest. The analysis game resulted in final decisions on what files are to be produced. Now you actually make the production. Throwing the ball is not really all that hard, so long as you enlist the aid of WORMs. No, not the creepy crawly kind, but the “Write Once, Read Many” times kind, such as optical discs, CDs or DVDs. The ESI on these media cannot be altered after written onto the discs, thus providing you, and the receiving party, with a certain amount of protection that the files will not be accidentally altered. WORMs help the parties maintain a permanent record of the ESI produced.

Another tricky aspect of production is deciding the form of production. Do you produce in native format with full internal metadata retained, or do you produce in a TIFF or JPEG format with a load file ready for import into review software? This should have already been worked out with opposing counsel as part of the Rule 26(f) conference, or the original production request; but if not, you have to make these decisions now.

Take the time to clearly mark and label the production media. One thing I hate is a CD production with no writing on it, or just indecipherable handwriting. Write out a full description of the CD, the date of production and the name of the case. Think of chain of custody issues and do not forget to make multiple copies. Another thing I have noticed lately is the use of paper labels on CDs. That’s ok, but beware of labels that peel off. As a safeguard, it is better to use ink jet printers that print directly on the CD, instead of glue-on labels. If you must use adhesive labels, put some kind of writing directly on the CD itself, just in case the label peels off somewhere down the line.

Finally, if you use TIFF or other image type files where you affix Bates stamp type markings to identify individual ESI files, please consider adding a truncated hash value to the file ID. As discussed in HASH: The New Bates Stamp, this will facilitate both identification and authentication, and allow for easier comparison with the native originals.
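As a rough illustration of what such a label might look like, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example, not the protocol from the article: the function names truncated_hash and file_label, the "ABC" style prefix, the six-digit sequence number, the use of SHA-1, and the choice to keep the first three and last three characters of the hash value. The hash is computed from the native file, so the label on the image points back to the native original.

```python
import hashlib


def truncated_hash(data: bytes, keep: int = 3, algorithm: str = "sha1") -> str:
    """Return a shortened hash: the first and last `keep` hex characters.

    The truncation scheme (first three plus last three) is an assumption
    for this sketch; adjust `keep` to whatever the parties agree on.
    """
    digest = hashlib.new(algorithm, data).hexdigest().upper()
    return f"{digest[:keep]}.{digest[-keep:]}"


def file_label(prefix: str, number: int, data: bytes) -> str:
    """Build a Bates-style ID with the truncated hash appended.

    `prefix` and `number` are hypothetical labeling fields, e.g. a party
    abbreviation and a sequential document number.
    """
    return f"{prefix}{number:06d}-{truncated_hash(data)}"
```

A label would then be produced with something like file_label("ABC", 1, native_bytes), where native_bytes is the content of the native file read in binary mode. Anyone holding the native original can recompute the full hash and confirm that its first and last characters match the label.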

Concluding Thoughts

These games are difficult. Much like golf, this is not a game of perfect. Mistakes are inevitable. Even Tiger Woods messes up from time to time, and does not win them all, so why should you be any different? Document your efforts, play it safe, and use redundant systems whenever economically feasible. Thus, when a mistake is later discovered, you may be able to cover it with a backup system. Or, if that is not possible, you can at least show the supervising judge that you made good faith, reasonable efforts. The judge should understand and cut you a break, maybe even give you a mulligan. If the judge does not realize that mistakes are inevitable, he or she simply does not understand the game. Then it is up to you to explain it to them, or hire an expert who can.


Venue Analysis Transformed by e-Discovery and the Digitization of Society

March 16, 2008

The location and availability of documents have always been important considerations in determining whether venue should be transferred for “the convenience of parties and witnesses, in the interest of justice.” 28 U.S.C. § 1404(a). Not anymore. Two new cases have shown that the digitization of records and electronic discovery are quickly rendering these criteria obsolete. Victory Intern. (USA) Inc. v. Perry Ellis Intern., Inc., 2008 WL 65177 (D.N.J., Jan. 2, 2008); ICU Medical, Inc. v. RyMed Technologies, Inc., 2008 WL 205307 (D. Del., Jan. 23, 2008).

When parties argue about venue, and what court is better situated to hear a case, they argue about convenience. These arguments not only include the location of witnesses, but, traditionally, also the location of records. If a court is located closer to where the original paper records are stored, that is supposed to be a factor favoring selection of that court.

This law developed in the last century when almost all documents were paper. At that time it made sense. If a case involved a warehouse full of paper records, then the proximity of a court to that warehouse was an important consideration. It made document productions, depositions, hearings and trial easier and less expensive. The proximity to original records was one of several factors of convenience and justice that a court would consider in determining whether to transfer venue.

In this new century, well over ninety percent of business and other records are now in digital form. The electronically stored information can be fairly easily, and sometimes almost instantly, transferred from one computer to another, regardless of where the computers are located. With hash verification in place, the electronic document in the second computer is just as much an original as the electronic document in the first. It is exactly the same computer file. Paper printouts of these multiple original electronic records can then be made anywhere on demand. For instance, they can be made if and when needed for depositions, hearings and trial. The proximity of the courthouse to the location of the computers used to create and store the information is irrelevant.
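To make the hash verification step concrete, here is a minimal sketch in Python of how a copy on a second computer could be checked against the first. The function names file_digest and is_identical_copy are hypothetical, and the default of SHA-1 is an assumption; any algorithm the parties agree on (MD5, SHA-1, SHA-2) can be passed in by name.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha1", chunk_size: int = 65536) -> str:
    """Hash a file's contents, reading in chunks so large files fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_identical_copy(original_path: str, copy_path: str, algorithm: str = "sha1") -> bool:
    """Return True when both files produce the same hash value."""
    return file_digest(original_path, algorithm) == file_digest(copy_path, algorithm)
```

In practice the two files sit on different machines, so each side would run file_digest locally and the parties would compare the resulting values. The hash depends only on the file's contents, not on its name or location, which is why the place where the computers happen to sit drops out of the picture.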

Victory Intern. v. Perry Ellis Intern.

The Victory case involves a fight over perfume and the right of Victory International to distribute Perry Ellis fragrances. Victory Intern. (USA) Inc. v. Perry Ellis Intern., Inc., 2008 WL 65177 (D.N.J., Jan. 2, 2008). It was brought by a single plaintiff in New Jersey against a host of companies and individuals. As is common in antitrust-type cases like this, the plaintiff, Victory, alleged a long list of claims:

Victory seeks relief against the defendants for violation of Section 1 of the Sherman Act (15 U.S.C. § 1), violation the Donnelly Act (N.Y. Gen. Bus. Law §§ 340-347), interference with contract, interference with prospective business advantage, breach of contract, fraud, deceit, unjust enrichment, violations of the Florida RICO statute, deceptive trade practices, restraints of trade, and unfair competition under the common law of the State of New Jersey and the other states of the Union.

The plaintiff, and a couple of the defendants, were located in New Jersey where the suit was filed, but thirteen of the defendants were located in South Florida. The defendants moved the District Court in New Jersey to transfer the case to the District Court in Miami. The parties all agreed that both courts had jurisdiction, so it was strictly a venue issue.

Senior New Jersey District Court Judge Walls begins his analysis with the federal statute governing venue transfer: 28 U.S.C. § 1404(a). It states that if two District Courts have jurisdiction, then for “the convenience of parties and witnesses, in the interest of justice, a district court may transfer any civil action to any other district or division where it might have been brought.” Judge Walls goes on to explain:

A determination of whether to transfer must incorporate “all relevant factors to determine whether on balance the litigation would more conveniently proceed and the interests of justice be better served by transfer to a different forum.” Jumara v. State Farm Ins. Co., 55 F.3d 873, 879 (3d Cir.1995) “Transfer analysis under Section 1404 is flexible and must be made on