Friday, July 24, 2009

How to validate ISBNs?


Problem/Question/Abstract:

ISBNs (or International Standard Book Numbers) are mystical code numbers that uniquely identify books. The purpose of this article is to remove the mystery surrounding the structure of the ISBN, allowing applications to perform data validation on entered candidate ISBNs.

Answer:

The ISBNs covered here are the ten-digit form (often called ISBN-10), written as thirteen characters drawn from the digits "0" through "9", the hyphen, and the letter "X". The code is divided into four parts separated by hyphens: group identifier, publisher identifier, book identifier, and check digit. The group identifier identifies a country, geographical region, or language area. The publisher identifier uniquely identifies the publisher. The book identifier uniquely identifies a given book within that publisher's collection. The check digit is combined with the other digits in an algorithm that verifies the ISBN. The first three parts vary in length, but together they always contain nine digits, and the check digit is always a single character ("0" through "9", or "X" for a value of 10). The ISBN as a whole therefore always consists of thirteen characters: ten digits (the last possibly being "X") plus the three hyphens that separate the four parts.

The ISBN 3-88053-002-5 breaks down into the parts:

  Group:       3
  Publisher:   88053
  Book:        002
  Check Digit: 5
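
As a rough illustration of this breakdown, the sketch below splits a hyphenated ISBN into its four parts. The function name SplitISBN and its parameters are hypothetical, not part of any library. It only counts hyphens and does not check that the parts contain digits.

```pascal
// Hypothetical helper: splits a hyphenated ISBN into its four parts.
// Returns False unless the string contains exactly three hyphens.
function SplitISBN(const ISBN: string;
  var Group, Publisher, Book, CheckDigit: string): Boolean;
var
  Rest: string;
  Parts: array[1..3] of string;
  P, N: Integer;
begin
  SplitISBN := False;
  Rest := ISBN;
  N := 0;
  // Peel off the text before each hyphen, at most three times
  P := Pos('-', Rest);
  while (P > 0) and (N < 3) do
  begin
    Inc(N);
    Parts[N] := Copy(Rest, 1, P - 1);
    Delete(Rest, 1, P);
    P := Pos('-', Rest);
  end;
  // Three hyphens must have been consumed and none may remain
  if (N = 3) and (P = 0) then
  begin
    Group := Parts[1];
    Publisher := Parts[2];
    Book := Parts[3];
    CheckDigit := Rest;
    SplitISBN := True;
  end;
end;
```

Calling SplitISBN('3-88053-002-5', G, P, B, C) with four string variables should return True and fill them with '3', '88053', '002' and '5'.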

An ISBN can be verified using a simple mathematical algorithm. Take each of the nine digits from the first three parts of the ISBN (ignoring the hyphens) and multiply each one by eleven minus its position counted from the left, so the first digit is multiplied by 10, the second by 9, and so on down to 2 for the ninth. Add the products together, add the check digit (counting "X" as 10), and divide the total by eleven. If the division leaves no remainder (i.e., the total is congruent to 0 modulo 11), the candidate is a valid ISBN. For example, using the previous sample ISBN 3-88053-002-5:

  ISBN:              3  8  8  0  5  3  0  0  2  5
  Digit Multiplier: 10  9  8  7  6  5  4  3  2  1
  Product:          30+72+64+00+30+15+00+00+04+05 = 220

Since 220 is evenly divisible by eleven (220 = 11 x 20), this candidate ISBN is a valid ISBN code.
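
The same test can be written compactly. The notation is an assumption introduced here for illustration: d_1 through d_10 are the ten digits from left to right, with d_10 the check digit and "X" counted as 10.

```latex
\sum_{i=1}^{10} (11 - i)\, d_i \equiv 0 \pmod{11}
\qquad\Longleftrightarrow\qquad
d_{10} = \Bigl(11 - \Bigl(\sum_{i=1}^{9} (11 - i)\, d_i \bmod 11\Bigr)\Bigr) \bmod 11
```

For the sample ISBN, the first nine digits contribute 30 + 72 + 64 + 0 + 30 + 15 + 0 + 0 + 4 = 215, and 215 mod 11 = 6, so the check digit is (11 - 6) mod 11 = 5, matching the printed code.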

This verification algorithm is easily translated into Pascal/Delphi code. String functions extract the check digit and the remainder of the ISBN from the String value passed to the validation function. The check digit is converted to an Integer, which becomes the starting value of the running total. A For loop then processes each character of the remainder in turn, skipping the hyphens, and adds to the total each digit multiplied by eleven minus its position among the digits (10 for the first digit, 9 for the second, and so on). Finally, the total is checked to see whether it is evenly divisible by eleven (a valid ISBN) or not (an invalid candidate).

Here is an example of this methodology applied in a Delphi function:

function IsISBN(ISBN: string): Boolean;
var
  Number, CheckDigit: string;
  CheckValue, CheckSum, Err: Integer;
  i, Cnt: Word;

begin
  // Get check digit
  CheckDigit := Copy(ISBN, Length(ISBN), 1);
  // Get rest of ISBN, minus check digit and its hyphen
  Number := Copy(ISBN, 1, Length(ISBN) - 2);
  // Length of ISBN remainder must be 11 and check digit between 0 and 9 or X
  if (Length(Number) = 11) and (Pos(CheckDigit, '0123456789X') > 0) then
  begin
    // Get numeric value for check digit
    if (CheckDigit = 'X') then
      CheckSum := 10
    else
      Val(CheckDigit, CheckSum, Err);
    // Iterate through ISBN remainder, applying decode algorithm
    Cnt := 1;
    for i := 1 to Length(Number) do
    begin
      // Act only if current character is between "0" and "9" to exclude hyphens
      if (Pos(Number[i], '0123456789') > 0) then
      begin
        Val(Number[i], CheckValue, Err);
        // Algorithm for each character in ISBN remainder, Cnt is the nth character
        //    so processed
        CheckSum := CheckSum + CheckValue * (11 - Cnt);
        Inc(Cnt);
      end;
    end;
    // Valid only if the final value is evenly divisible by 11
    IsISBN := (CheckSum mod 11 = 0);
  end
  else
    IsISBN := False;
end;

This example is deliberately kept simple to demonstrate the ISBN check algorithm. A real-world application would want several additional features. For instance, the function requires the candidate ISBN to be passed as a Pascal String value that includes the hyphens dividing the four parts; an extended version might also accept candidate ISBNs entered without hyphens. Another useful addition would be a check that exactly three hyphens are present and properly placed, rather than just thirteen characters in total.
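
The two extensions mentioned above could be sketched roughly as follows. This is one possible approach, not a definitive implementation, and the name IsISBN10Flexible is hypothetical. It accepts a candidate with either no hyphens or exactly three, and rejects leading, trailing, or doubled hyphens. It does not check that the last hyphen sits directly before the check digit, and it accepts only an uppercase "X".

```pascal
// Hypothetical variant of IsISBN: accepts "3-88053-002-5" or "3880530025".
function IsISBN10Flexible(const ISBN: string): Boolean;
var
  i, Digits, Hyphens, CheckSum: Integer;
  Ch, Prev: Char;
begin
  IsISBN10Flexible := False;
  Digits := 0;
  Hyphens := 0;
  CheckSum := 0;
  Prev := '-';  // makes a leading hyphen look like a doubled one
  for i := 1 to Length(ISBN) do
  begin
    Ch := ISBN[i];
    if Ch = '-' then
    begin
      if Prev = '-' then
        Exit;  // leading or doubled hyphen
      Inc(Hyphens);
    end
    else if (Ch >= '0') and (Ch <= '9') then
    begin
      Inc(Digits);
      if Digits > 10 then
        Exit;  // too many digits
      // Weight is 10 for the first digit down to 1 for the check digit
      CheckSum := CheckSum + (Ord(Ch) - Ord('0')) * (11 - Digits);
    end
    else if (Ch = 'X') and (Digits = 9) then
    begin
      // "X" stands for 10 and is only legal as the check digit
      Inc(Digits);
      CheckSum := CheckSum + 10;
    end
    else
      Exit;  // any other character is invalid
    Prev := Ch;
  end;
  // Ten digits, no trailing hyphen, and either zero or three hyphens
  if (Digits = 10) and (Prev <> '-') and
     ((Hyphens = 0) or (Hyphens = 3)) then
    IsISBN10Flexible := (CheckSum mod 11 = 0);
end;
```

With this sketch, both IsISBN10Flexible('3-88053-002-5') and IsISBN10Flexible('3880530025') should return True, while '3-88053-002-6' (wrong check digit) and '3-88053-0025' (only two hyphens) should return False.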
