KDE PIM/KItinerary/Trenitalia Barcode: Difference between revisions

From KDE Community Wiki
Vkrause (talk | contribs)
Revision as of 17:46, 12 June 2019

General Observations

  • always 67 bytes long
  • covers exactly one passenger and one leg
  • does not seem to contain signatures, checksums or compression, since adjacent tickets differ only in minimal bit patterns
  • far fewer null bytes when a PNR/seat reservation is present (i.e. high-speed train tickets?)
  • there is a unique, globally sequential ticket number: in the 600M range in 2017 and in the 1B range in 2019, which suggests 32 bits might be a bit short for this

Bit Layout

Byte:Bit (MSB) | Content | Meaning | Notes
0:0 - 4:7 | 0x20 0x14 0xC2 0x08 0x10 | header? | fixed value in all samples
5:0 - 7:7 | | date? | varies between samples
8:0 - 13:4 | null | |
13:5 - 14:3 | 100 xxxx | | 0000 for Italian station codes (UIC 83.....), 0111 for international destination?
14:4 - 17:3 | 24 bit uint | UIC station code of departure | only for Italian destinations apparently
17:4 - 18:2 | 100 xxxx | | see above
18:3 - 21:2 | 24 bit uint | UIC station code of arrival | see above
21:3 - 22:1 | | unknown | 22:1 seems 0 in all samples, the rest varies
22:2 - 24:1 | 16 bit uint | train number | could be as little as 14 bits too, no train number > 16k has been observed
24:3 - 29:7 | null | |
30:0 - 31:1 | | | null if no PNR present, content unknown
31:2 - 32:0 | 7 bit uint | seat number | seat row for trains with an airplane-like numbering scheme
32:1 - 32:2 | null | |
32:3 - 32:6 | 4 bit uint | seat column as hex number | 0 for trains without an airplane-like numbering scheme
32:7 - 33:6 | null? | |
33:7 - 38:3 | 6x6 bit as listed below | PNR | all null if ticket has no PNR
38:4 - 43:3 | null | |
43:4 - 44:2 | 1010 011 or null | Issuer UIC code | only set if PNR is present ("83" for Trenitalia tickets), null otherwise; a few preceding bits are likely part of this field as well
44:3 - 45:7 | null | |
46:0 - 48:7 | | unknown | null if no PNR present, unknown content otherwise
49:0 - 49:7 | null | |
50:0 - 50:7 | | unknown | null if no PNR present, unknown content otherwise
51:0 - 57:7 | 0x0B 0x65 0x23 0x18 0x40 0xE6 0xC0 | | fixed in all samples?
58:0 - 58:3 | null | | might be part of the ticket number?
58:4 - 62:3 | 32 bit uint | ticket number |
63:4 - 65:7 | null | |
66:0 - 66:7 | | | varies between samples, same as 6:4 - 7:3?
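
The Byte:Bit ranges above can be read mechanically once the bit numbering is fixed. The following is a minimal sketch, assuming bit 0 is the most significant bit of each byte (as the "MSB" column header suggests) and that both ends of a range are inclusive. The names read_bits, FIELDS and parse are hypothetical helpers for illustration, not part of KItinerary, and only a few of the better understood fields are listed.

```python
def read_bits(data: bytes, start_byte: int, start_bit: int,
              end_byte: int, end_bit: int) -> int:
    """Return the unsigned integer stored in an inclusive Byte:Bit range.

    Assumption: bit 0 is the most significant bit of a byte.
    """
    first = start_byte * 8 + start_bit
    last = end_byte * 8 + end_bit
    if first > last or last >= len(data) * 8:
        raise ValueError("bit range out of bounds")
    width = last - first + 1
    # Treat the whole buffer as one big-endian number and shift the
    # requested range down to the least significant position.
    value = int.from_bytes(data, "big")
    shift = len(data) * 8 - 1 - last
    return (value >> shift) & ((1 << width) - 1)


# A few fields from the layout table: name -> (byte, bit, byte, bit).
FIELDS = {
    "header": (0, 0, 4, 7),
    "departure_uic": (14, 4, 17, 3),
    "arrival_uic": (18, 3, 21, 2),
    "train_number": (22, 2, 24, 1),
    "seat_number": (31, 2, 32, 0),
    "seat_column": (32, 3, 32, 6),
    "ticket_number": (58, 4, 62, 3),
}


def parse(data: bytes) -> dict:
    """Extract the fields listed in FIELDS from a 67 byte barcode payload."""
    if len(data) != 67:
        raise ValueError("expected 67 bytes")
    return {name: read_bits(data, *rng) for name, rng in FIELDS.items()}
```

Working on one big integer keeps fields that straddle byte boundaries (almost all of them here) free of per-byte masking logic, at the cost of converting the whole buffer on every call, which is negligible for 67 bytes.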

PNR Decoding Table

Based on all available samples, PNR strings map to the PNR binary encoding as listed in the table below:

Code Symbol
0 W
4 2
6 3, Z
8 4
10 5
12 6
14 7
18 9
22 B
24 C
26 D
28 E
30, 31 F
32 G
34 H
38, 39 J
40 K
42, 43 L
44 M
46 N
50 P
52 Q
55 R
56 S
59 T
60 U
62 V

This is obviously incomplete due to too few samples. Observations/speculations:

  • the lowest bit seems irrelevant for which symbol a code maps to, the meaning of that bit is unknown
  • there are only 32 rather than 36 symbols used, probably omitting those that are hard to differentiate (1/I or O/0).
  • the gaps in the table are suspected to be 2 -> X, 16 -> 8, 20 -> A, 36 -> I, 48 -> O
  • the only conflicting sample with that theory is the clash on 6 -> 3/Z.
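
If the lowest bit really is irrelevant, the table collapses into a 32 entry alphabet indexed by code >> 1. The sketch below works under that assumption. The entries for X, 8, A, I and O are the speculative gap fillers, not observed values, with 8 placed at code 16, the only free slot between 7 and 9. PNR_ALPHABET and decode_pnr are hypothetical names.

```python
# Symbols indexed by (code >> 1). X, 8, A, I and O are suspected
# gap fillers, not confirmed by any sample.
PNR_ALPHABET = "WX23456789ABCDEFGHIJKLMNOPQRSTUV"


def decode_pnr(codes):
    """Map a sequence of 6 bit PNR codes to a PNR string.

    The lowest bit of each code is ignored, as its meaning is unknown.
    """
    symbols = []
    for code in codes:
        if not 0 <= code <= 63:
            raise ValueError(f"not a 6 bit code: {code}")
        symbols.append(PNR_ALPHABET[code >> 1])
    return "".join(symbols)
```

This mapping always yields 3 for code 6, so it cannot reproduce the one sample where code 6 corresponds to Z.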

Open Questions

  • there is one sample where the departure and arrival UIC station codes are equal, contrary to what's in the corresponding PDF
  • for international destinations the station code does not actually seem to be a valid UIC station code, but there is only one sample to back this up so far
  • It seems plausible that the coach number is also encoded, given that the seat number is; 30:0 - 31:1 would seem like the obvious range. In the sample data this does not seem to be the case with any obvious encoding.
  • Is the train type derived from the train number, or is that also encoded? 21:3 - 22:1 would seem to be the obvious range for that, but so far no correlation has been found in the current samples.
  • Is the class encoded somewhere?
  • Date/time encoding is still a mystery.
  • Is the PNR encoded? Bytes 46 - 50 seem like a plausible range for that. 36^5 possible values, i.e. needs >= 26 bits.
  • What is the "CP Code" in the PDF, and is that encoded somewhere? -> unlikely based on comparing adjacent codes
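
The bit budget quoted in the PNR question above can be checked with a short calculation. This is only an arithmetic sketch. bits_needed is a hypothetical helper, and the 32 symbol case is an assumption taken from the PNR decoding table notes.

```python
def bits_needed(alphabet_size: int, length: int) -> int:
    """Minimum number of bits to store any string of the given length."""
    # The largest value to represent is alphabet_size ** length - 1.
    return (alphabet_size ** length - 1).bit_length()


full_alphabet = bits_needed(36, 5)     # 5 symbols from 0-9 and A-Z
reduced_alphabet = bits_needed(32, 6)  # 6 symbols from a 32 symbol set
```

With 36 symbols and 5 characters this gives 26 bits, matching the estimate above. Six characters from a 32 symbol alphabet would need 30 bits if packed densely, which is less than the 6x6 bit field listed in the bit layout.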