The Textual Data Base

Intralinear Transliteration Codes

Giorgio Buccellati – January 2010

Level 1 : intralinear
1a: column format
1b: line format
1c: graphic unit, graphic condition, graphemic value
1d: sign level code
1e: word level code
1f: embedded note

     I give here the definition for file format and for text encoding as provided in the original 1987 disk. The content is the same (and is given in black), since it is still applicable, but it has been updated for browser display.


Level 1 : intralinear

1a: column format. Columns are indicated by Arabic numerals
to conform to the published edition of the text. Should a column
be omitted in the published edition, this column will receive an
"a" (e.g. 3a refers to the column following 3 that is omitted in the
column numeration of the published edition of the text). Should
a break in the text preclude the clear delimitation of columns,
the designation ?+ will be utilized to reflect the unknown number
of columns. When column sequence is to be inverted, column numeration
will continue sequentially, and an ! will follow the column number.
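
As a minimal sketch, the column designations described above could be read by a program along the following lines. The function name parse_column, the dictionary keys, and the assumption that "a" precedes "!" when both occur are illustrative only and are not part of the original encoding manual.

```python
import re

# Hypothetical pattern: digits, an optional "a" (column omitted in the
# published edition), and an optional "!" (inverted column sequence).
# The relative order of "a" and "!" is an assumption.
_COLUMN_RE = re.compile(r"^(?P<number>\d+)(?P<omitted>a?)(?P<inverted>!?)$")


def parse_column(designation):
    """Split a column designation such as '3', '3a', '4!' or '?+'."""
    if designation == "?+":
        # A break in the text leaves the number of columns unknown.
        return {"number": None, "omitted_in_edition": False,
                "inverted": False, "unknown_count": True}
    match = _COLUMN_RE.match(designation)
    if match is None:
        raise ValueError(f"unrecognized column designation: {designation!r}")
    return {
        "number": int(match.group("number")),
        "omitted_in_edition": match.group("omitted") == "a",
        "inverted": match.group("inverted") == "!",
        "unknown_count": False,
    }
```

For instance, parse_column("3a") reports column 3 with omitted_in_edition set to True.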

The front and back of a tablet are designated as:
r front
v back

Writing on the edges of a tablet is designated as:
le left edge
re right edge

The condition of a column may be further specified as:
cb1 the beginning of the column is broken
cb2 the end of the column is broken
cb3 the entire column is broken
cb4 break within column of indeterminate length
cb99 an unknown number of columns destroyed

The designations for broken columns are used only when the number
of columns cannot be profitably approximated.

ce1 the beginning of the column is blank
ce2 the end of the column is blank
ce3 the entire column is blank
ce4 blank space of indeterminate length in the middle of
the column

cr1 beginning of column erased
cr2 end of column erased
cr3 entire column erased
cr4 an erasure of indeterminate length in the middle of the
column
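
The column condition codes follow a regular pattern: a two-letter prefix (cb, ce, cr) for the kind of condition and a digit for its extent, with cb99 as a special case. A small lookup along these lines might decode them. The function name describe_column_condition and the wording of the returned strings are assumptions made for illustration.

```python
# Meanings taken from the lists above; names are hypothetical.
CONDITIONS = {"cb": "broken", "ce": "blank", "cr": "erased"}
EXTENTS = {
    "1": "beginning of the column",
    "2": "end of the column",
    "3": "entire column",
    "4": "indeterminate length within the column",
}


def describe_column_condition(code):
    """Return a readable description of a code such as 'cb1' or 'ce3'."""
    if code == "cb99":
        return "an unknown number of columns destroyed"
    prefix, extent = code[:2], code[2:]
    if prefix not in CONDITIONS or extent not in EXTENTS:
        raise ValueError(f"unrecognized column condition code: {code!r}")
    return f"{EXTENTS[extent]}: {CONDITIONS[prefix]}"
```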

1b: line format. Each line must be provided with the published line
number. If none exists, one will be provided. A 0 number may
be utilized at the beginning of a column to provide further
information about the text. A primed number (e.g. 1')
will be used when a break in the text precludes sequential
numbering from top to bottom.

1c: graphic unit, graphic condition, graphemic value

letters: a b d ... for phonemic values
A B D ... for logograms
*A *B *D ... for unknown readings (i.e. signs
which are identifiable, but whose
reading in context is uncertain)
' ... aleph
c ... ayin
s` ... sade
s^ ... shin
t` ... tet
t^ ... tha
h ... khet
q ... qof
: ... "Glossenkeil"

Numerals: Arabic numerals are used exclusively, and are written so as
to preserve the manner in which they appear graphically in the
text. A hyphen separates the groups of units, tens and sixty-
signs. For example, the number 94 would be transliterated
as 60-30-4; 60 represents one sign, 30 represents three tens
and 4 represents four units. Optionally, in cases of textual
ambiguity, an abbreviation may be added to render more explicitly
the form of the sign (whether curviform or wedge), and the
graphic orientation of the sign. Curviform signs are defined
as those signs formed with the blunt cylindrical end of the

(w) wedge (c) curviform
(wh) horizontal wedge-shaped (ch) horizontal curviform-shaped
(wv) vertical wedge-shaped (cv) vertical curviform-shaped
(ws) slanted wedge-shaped (cs) slanted curviform-shaped

Fractions are written as 1/2, 1/4, 1/3.
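
The numeral convention above is additive: each hyphen-separated group records what is written, and the groups sum to the value. A minimal sketch, assuming plain integer groups only (no fractions and no shape abbreviations such as (w) or (cv)); the function name numeral_value is hypothetical:

```python
def numeral_value(transliteration):
    """Sum the hyphen-separated groups of a transliterated numeral."""
    # Each group (sixty-signs, tens, units) is written as an Arabic numeral.
    groups = transliteration.split("-")
    return sum(int(group) for group in groups)
```

With the example from the text, numeral_value("60-30-4") gives 94.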

condition of text: X a single unreadable sign
N a single unreadable sign representing a
... undefined sequence of broken signs
| between two readings designates a
partially broken sign whose clear
identification is uncertain
[] restored
[^]^ broken sign or sequence of signs
<> added by modern editor
<<>> mistakenly written by scribe
<<<>>> modern correction of ancient error
<>-<<>> reflects the insertion of a sign, and
the deletion of another. This is
used in cases when a sign is written
resembling the intended sign, and is
so corrected by the modern editor.
? after sign for uncertain reading
! after sign for abnormal graphic writing
!! after sign for divergence from
published transliteration
!!! after sign designating both abnormal
graphic writing and divergence from
published transliteration
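
The trailing markers ?, !, !! and !!! can be separated from the sign they follow with a short routine such as the one below. This is a sketch under the assumption that a sign carries at most one such marker; the names FLAG_MEANINGS and split_flag are illustrative.

```python
import re

# Meanings as listed above.
FLAG_MEANINGS = {
    "?": "uncertain reading",
    "!": "abnormal graphic writing",
    "!!": "divergence from published transliteration",
    "!!!": "abnormal graphic writing and divergence from "
           "published transliteration",
}

# Lazy sign part, followed by an optional trailing marker.
_FLAG_RE = re.compile(r"^(?P<sign>.*?)(?P<flag>\?|!{1,3})?$")


def split_flag(token):
    """Return (sign, meaning); meaning is None when no marker is present."""
    match = _FLAG_RE.match(token)
    flag = match.group("flag")
    return match.group("sign"), FLAG_MEANINGS.get(flag)
```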

references to sign lists: u0(A) the sign "A" is being read
with a value of "u"

10 (GUR) number 10 in the series

graphic markers: ln1 calligraphic rule
ln2 string rule
ln3 blank case
ln4 blank space at the beginning of a case
ln5 blank line at the end of a case
vt vertical line
rs~A erasure of an identifiable sign "A"
When a sequence of identifiable signs
is erased, the symbol rs~ will precede
rs1,2,3 a specification of the approximate
number of signs erased
rs99 an erasure of an entire line
dt1,2,3 indentation with designation of
approximate number of signs indented

graphic relationships: (carriage return) line boundary
(blank) word boundary
- sign boundary
. intralogographic boundary (e.g. PA.TE
for EN5)
@+ ligature (e.g. [email protected]+na)
@x inclusion (e.g. [email protected])
@^ in front of each sign designates a
superscript word (e.g. @^[email protected]^na)
@' in front of a superscript sign
@. in front of each sign designates a
subscript word (e.g. @[email protected])
@> in front of graphically small signs
@: in front of a graphically small word
@< in front of a sign lacking its usual
complement of strokes (e.g. @
@\ following a "tenu:" sign (e.g. [email protected]@\)
@; following a "gunu:" sign (e.g. [email protected];)
@| between two or more signs which
appear vertically atop each other
(e.g. [email protected]|AN)
@# before a sign written upside down
(e.g. @#UD)
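
The two basic boundaries, the blank between words and the hyphen between signs, are enough to tokenize a simple line. The sketch below assumes a line without embedded notes or numeral qualifiers, and it leaves any @ markers and = markers attached to the sign they accompany; the function name split_line is hypothetical.

```python
def split_line(line):
    """Return a list of words, each given as a list of signs."""
    # Blank = word boundary; hyphen = sign boundary.
    return [word.split("-") for word in line.split()]
```

A logogram written with several components can then be broken down further at the intralogographic boundary, e.g. "PA.TE".split(".").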

The following designations are used exclusively to clarify
graphic relationships of numerals:

+ used in broken contexts to reflect that the numerals
so connected are regarded as a unit (e.g. [5+]3)
-: indicates that the number which follows qualifies
the preceding sign or sequence of signs
(e.g. sa-ha-wa-:2)
:- indicates that the number which precedes qualifies
the following sign or sequence of signs
(e.g. 2 3:-*NI)
+: indicates that the numeral is written as a ligature
(e.g. IB2+:2)
x: indicates that the numeral is written as an inclusion
(e.g. IB2x:2)

1e: word level code
(i) graphemic level

Preposed determinatives are immediately followed by = and
the sign boundary - (e.g. DINGIR=-'a3-da; MI2=-is"-tar-um-mi;
I=-mu-ka-an-ni-s"i-im). Postposed determinatives are
immediately preceded by = which is preceded by the sign
boundary designation - (e.g. ib-la-=KI; UD 2-=KAM;

Preposed phonetic complements are followed by the symbol =+
and postposed phonetic complements are preceded by the symbol +=
(e.g. LIM=+*LULIM+=LU; here LIM is a preposed complement while
LU is a postposed complement).
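
Since determinatives and phonetic complements are marked symmetrically, one helper can separate both kinds from the core of a word. The following is a sketch only: the function names are hypothetical, and it assumes at most one preposed and one postposed element of each kind per word.

```python
def _split_affixes(word, pre_marker, post_marker):
    """Return (preposed, core, postposed); missing parts are None."""
    pre = post = None
    if pre_marker in word:
        pre, word = word.split(pre_marker, 1)
    if post_marker in word:
        word, post = word.rsplit(post_marker, 1)
    return pre, word, post


def split_determinatives(word):
    # Preposed: followed by "=-"; postposed: preceded by "-=".
    return _split_affixes(word, "=-", "-=")


def split_phonetic_complements(word):
    # Preposed: followed by "=+"; postposed: preceded by "+=".
    return _split_affixes(word, "=+", "+=")
```

Applied to the examples in the text, split_determinatives("ib-la-=KI") separates the postposed KI, and split_phonetic_complements("LIM=+*LULIM+=LU") separates LIM and LU from *LULIM.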

(ii) non-graphemic level
d_ any letter followed by an underscore preceding any single
sign. This code applies to either sign or word, hence the
character used will distinguish between the two levels. These
categorizations do not reflect written signs, but the
interpretation of a sign or unit of signs, particularly for
onomastics. These codes may be cumulative.
p_ personal name, gender unknown
f_ human feminine name
F_ divine feminine name
m_ human masculine name
M_ divine masculine name
D_ divine name, gender unknown
g_ geographical names
n_ other proper names
examples: g_ib-la-=KI
ITI n_za-lul
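
The category prefixes can be stripped and decoded with a simple table. In this sketch, cumulative codes are assumed to be written as stacked prefixes (one letter plus underscore after another); that reading, like the names CATEGORY_CODES and strip_category_codes, is an assumption and is not stated in the text.

```python
# Code letters and meanings as listed above.
CATEGORY_CODES = {
    "p": "personal name, gender unknown",
    "f": "human feminine name",
    "F": "divine feminine name",
    "m": "human masculine name",
    "M": "divine masculine name",
    "D": "divine name, gender unknown",
    "g": "geographical name",
    "n": "other proper name",
}


def strip_category_codes(token):
    """Return (list of category meanings, token without its prefixes)."""
    categories = []
    # Assumption: cumulative codes appear as repeated "x_" prefixes.
    while len(token) >= 2 and token[1] == "_" and token[0] in CATEGORY_CODES:
        categories.append(CATEGORY_CODES[token[0]])
        token = token[2:]
    return categories, token
```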

1f: embedded note

(space) !text of note! (space)

This convention is used regularly within this version of the Ebla corpus
to direct attention to the broken sign lists compiled by the editors of
the "Archivi Reali di Ebla Testi" (ARET) volumes. Should a broken sign
or sequence of signs be represented in the tables of the ARET volumes,
the embedded note designation will provide the page number upon which the
sign(s) appear. For example, !ARET2 p.167! indicates that the broken
sign on this particular line is to be found represented in ARET volume
2 page 167.
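
Because an embedded note is set off by blanks on both sides, it can be told apart from the ! that follows a sign directly. A minimal sketch of extracting such notes from a line follows; the function name extract_notes and the sample line used in the usage note are illustrative, not taken from the corpus.

```python
import re

# An opening "!" at the start of the line or after a blank, the note
# text, and a closing "!" followed by a blank or the end of the line.
_NOTE_RE = re.compile(r"(?<!\S)!([^!]+)!(?!\S)")


def extract_notes(line):
    """Return (list of note texts, line with the notes removed)."""
    notes = _NOTE_RE.findall(line)
    # Remove the notes and normalize the remaining blanks.
    text = " ".join(_NOTE_RE.sub(" ", line).split())
    return notes, text
```

For a line such as "X !ARET2 p.167! ib-la-=KI", the note text "ARET2 p.167" is returned separately from the transliteration, while a token such as "ba!" is left untouched.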
