<html>
<head>
<title>Pracc: Page Counting</title>
<style type="text/css">
body { margin:2pc }
pre { margin:10px; padding:5px;
border:1px solid black; background-color:#ccc }
h1,h2,h3 { font-family:sans-serif }
th { font-family:sans-serif; text-align:left }
td.code { background-color:#ccc }
</style>
</head>
<body>
<p><i>Written by Urs-Jakob Rüetschi<br/>
as part of the <b>pracc</b> project.</i></p>
<h1>Page Counting</h1>
<p>Sending a print job to a networked printer is easy.
Counting how many pages get printed is delicate.
Basically, there are two approaches:</p>
<ol>
<li>Counting the pages in the print job</li>
<li>Asking the printer how many pages it printed</li>
</ol>
<p>Under optimal circumstances, the two approaches produce
the same figure for a print job. In real life, however,
circumstances are not optimal: there could be a lack of paper,
a paper jam, a network disruption, or just somebody fiddling
with the printer while it prints. These and many other problems
tell us that page counting is never 100% accurate!</p>
<p>With the first approach, counting pages in the print job,
the page description language has to be known. For example,
if it is PostScript, a tool like Ghostscript can be used to
count the pages in the job. But many other page description
languages exist, some documented, some proprietary, and they
change with printers, drivers, and versions.</p>
<p>Counting pages in the print job has further problems:
it uses CPU cycles on the print server, it can be
tricked by a skilled user, and it opens a serious
security hole, namely the execution of user software (the
print job) on a server, possibly even as a privileged user.
These problems may be considered theoretical.
A very real problem is that counting pages in a print job
tends to count more pages than actually get printed, and
users tend to be sensitive in that respect...</p>
<p>So we ask the printer about the pages printed for a job.
The printer is the only party involved in printing that is
authoritative on the number of pages printed. Indeed, most
printers count precisely how many pages they have printed since
they left the factory. They do this using a <i>page counter</i>,
a hardware register that is incremented whenever a page is printed
and is never decremented. The idea: read this value before and after
printing a job -- the difference is the number of pages printed,
irrespective of paper jams and similar problems.</p>
<p>I know of three methods to read the page counter value:</p>
<ol>
<li>using Adobe's PostScript</li>
<li>using HP's PJL (Printer Job Language)</li>
<li>using SNMP (Simple Network Management Protocol)</li>
</ol>
<p>Unfortunately, this page counter is meant for maintenance,
not for accounting, and there is no general method of associating
its current value with a particular print job: we do not know
when the printer is done printing a print job and, hence, when
to read the page counter. The printer may finish printing
long after the last byte of the job has been sent and the
network connection has been closed.</p>
<p>An obvious solution is to use a heuristic: poll the page
counter until it stops increasing for some time. This requires
a parameter, the time <i>t</i> to wait for another increase of
the page counter. By increasing <i>t</i> the accounting becomes
more reliable but printing gets slower (I had complaints about
that inevitable delay). Moreover, the method can be defeated:
just create a print job that waits for at least the time <i>t</i>
and only then starts printing... Ordinary users won't do this
but still it's possible (at least with PostScript).</p>
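<p>The polling heuristic can be sketched in a few lines of Python.
This is only an illustration under assumptions: <tt>read_pagecount</tt>
and <tt>send_job</tt> are hypothetical callables supplied by the caller
(they are not part of pracc), and the clock and sleep functions are
parameters only so that the loop can be exercised without a printer.</p>

```python
import time


def wait_until_stable(read_pagecount, t, interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll read_pagecount() until the value has not increased for t seconds.

    read_pagecount is a hypothetical callable returning the printer's
    current page counter as an int. Returns the last value read.
    """
    last = read_pagecount()
    last_change = clock()
    while True:
        if clock() - last_change >= t:
            return last              # no increase for t seconds: assume done
        sleep(interval)
        current = read_pagecount()
        if current > last:
            last = current           # still printing: restart the timer
            last_change = clock()


def pages_printed(read_pagecount, send_job, t, **poll_options):
    """Read the counter before the job, send the job, wait, take the difference."""
    before = read_pagecount()
    send_job()                       # hypothetical: transmits the print job
    after = wait_until_stable(read_pagecount, t, **poll_options)
    return after - before
```

<p>Every job is delayed by at least <i>t</i> after its last page,
which is exactly the trade-off between reliability and speed
described above.</p>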
<p>More help is offered by HP's Printer Job Language (PJL).
PJL provides asynchronous notification about job start/end
and the pages printed. It works fine in theory and mostly
so in practice (after all, it was designed with accounting
in mind). Nevertheless, I've found that some print jobs on
some printer models fooled PJL's page accounting (those
jobs were not hand-tailored, but generated by HP's own
drivers with certain device settings -- and therefore
probably revealed a bug in the PJL implementation of the
afflicted printers). Even where it works, it should be
remembered that PJL is an HP invention, although
more and more other printer manufacturers support it.</p>
<p>I've not yet tried SNMP, but I fancy it suffers the
same problem as the PostScript method: after all, it has
to rely on timeouts that can significantly slow down the
printing system.
On the other hand, the SNMP method is probably immune
to the kind of problems mentioned for the PJL method,
because it is hardly affected by the content of the print job.</p>
<p>Finally, complete (and complex) commercial solutions are
offered by some companies. The problem with those is, apart
from the costs for installation, integration, licensing, and
maintenance, that they tie you to a particular company and
their products. Sales representatives will claim that their
system is <em>open</em> and therefore works with any printer
(and copier), but this "open" usually means nothing
more than that they are willing to work with you towards
a solution if you pay them (or are a really big and important
customer). Big commercial printing systems are nice if you can
afford them <em>and</em> if you can start from scratch without
any legacy systems. I don't know anything about the accuracy
and robustness of those systems.</p>
<p><b>Summary:</b>
Printer accounting using open standards is never 100% accurate.
Accuracy when using proprietary systems is not known.
My experience with the "open standards methods"
is more than satisfying. But I'm working at a school with
students that are unlikely to take joy in hacking the
accounting system. Finally, it is an open question whether
printer accounting is worth the effort! More often than not
it is significant work to track down insignificant amounts
of money.</p>
<h2>Technical Details</h2>
<ul>
<li><a href="#ps">PostScript</a></li>
<li><a href="#pjl">PJL</a></li>
<li><a href="#snmp">SNMP</a></li>
</ul>
<a name="ps"></a>
<h3>Page Counting with PostScript</h3>
<p>The printer's pagecount hardware register can be read
through PostScript. It is convenient to wrap the pagecount
into a PostScript message so that it can be parsed along
with other PostScript messages.</p>
<p>To avoid confusion with pagecount messages from previous
print jobs or even to guard against maliciously generated
messages, a random "cookie" value should be included
in the pagecount message. The returned cookie can be used to
determine if the pagecount message is genuine.</p>
<pre>%!PS
(%%[ pagecount: ) print
statusdict begin pagecount end
20 string cvs print
(; cookie: 99999 ]%%) print flush</pre>
<p>The pagecount value is put on the stack, converted
to a string representation, and finally printed to the
printer's standard output, formatted as a PostScript
message that also contains the cookie value:</p>
<pre>%%[ pagecount: 12345; cookie: 99999 ]%%</pre>
<p>The <b>pscount</b>(<i>fd</i>, <i>cookie</i>) routine
can be used to send the above PostScript program, containing
the given <i>cookie</i>, to the given file descriptor <i>fd</i>.
On receipt of a syntactically correct pagecount message, the
PostScript message parser, <b>psparse</b>, sets the global
variables <i>ps_pagecount</i> and <i>ps_cookie</i>. The caller
of <b>psparse</b> should then check if <i>ps_cookie</i> is
identical to the cookie that was passed to <b>pscount</b>
and, if so, use <i>ps_pagecount</i> to update the program's
record of the initial or the final pagecount.</p>
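<p>The cookie check can be illustrated with a short Python sketch.
It is not the pracc implementation of <b>pscount</b> and <b>psparse</b>;
the function names below are hypothetical, and the sketch assumes the
cookie is a decimal number and the reply arrives as a byte string.</p>

```python
import re
import secrets

# Matches the pagecount message format shown above.
PAGECOUNT_RE = re.compile(rb"%%\[ pagecount: (\d+); cookie: (\d+) \]%%")


def new_cookie():
    """Return a random five-digit cookie (the width is an arbitrary choice)."""
    return secrets.randbelow(90000) + 10000


def make_pagecount_program(cookie):
    """Return the PostScript pagecount program with the cookie embedded."""
    return (
        "%!PS\n"
        "(%%[ pagecount: ) print\n"
        "statusdict begin pagecount end\n"
        "20 string cvs print\n"
        f"(; cookie: {cookie} ]%%) print flush\n"
    ).encode("ascii")


def parse_pagecount(data, expected_cookie):
    """Return the pagecount of the first message carrying expected_cookie.

    Messages with any other cookie (stale or forged) are skipped;
    None is returned if no genuine message is found.
    """
    for match in PAGECOUNT_RE.finditer(data):
        pagecount, cookie = int(match.group(1)), int(match.group(2))
        if cookie == expected_cookie:
            return pagecount
    return None
```

<p>A reply that still contains a pagecount message from an earlier job
is then handled as intended: only the message with the matching cookie
is used.</p>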
<p>Unfortunately, there is no known way to determine the end
of a print job using PostScript. The best we can do is read
the pagecount repeatedly until it remains stable for some time.
Of course, this is only a heuristic and can easily be fooled, for
example by carefully preparing a print job that includes a
delay loop...</p>
<a name="pjl"></a>
<h3>Page Counting with PJL</h3>
<p>HP's Printer Job Language (PJL) has features that specifically
support page-based accounting: By requesting "unsolicited
status" messages, the printer informs the host about pages
printed and print job start and end. Besides, it is also possible
to query the printer's pagecount register using PJL.</p>
<p>The function names in the table below refer to my low-level PJL
routines for generating PJL statements and for parsing the PJL response
messages. A print job for page counting should be structured as follows:</p>
<table border="1" cellpadding="2">
<tbody>
<tr><th>Print Job</th><th>Comments</th></tr>
<tr><td class="code"><tt>UEL@PJL</tt></td>
<td><b>pjluel</b>(<i>fd</i>)</td></tr>
<tr><td class="code"><tt>@PJL ECHO <i>cookie</i></tt></td>
<td><b>pjlecho</b>(<i>fd</i>, <i>cookie</i>)</td></tr>
<tr><td class="code"><tt>@PJL INFO PAGECOUNT</tt></td>
<td><b>pjlcount</b>(<i>fd</i>)<br><i>optional</i></td></tr>
<tr><td class="code"><tt>@PJL USTATUS JOB = ON<br>
@PJL JOB NAME = "<i>jobid</i>"<br>
UEL</tt> or <tt>ENTER LANGUAGE</tt></td>
<td><b>pjljob</b>(<i>fd</i>, <i>jobid</i>, 0 or "PCL" or "POSTSCRIPT")</td></tr>
<tr><td class="code"><tt>%!PS<br>showpage</tt></td>
<td>Send print job data.<br>
Use a select loop and process messages<br>
that may be sent back from the printer</td></tr>
<tr><td class="code"><tt>UEL@PJL</tt><br><tt>@PJL EOJ NAME="<i>jobid</i>"</tt></td>
<td><b>pjleoj</b>(<i>fd</i>, <i>jobid</i>)</td></tr>
<tr><td colspan="2">Wait for "unsolicited" PAGE and JOB messages<br>
using a select loop; process incoming messages.</td></tr>
<tr><td class="code"><tt>UEL@PJL USTATUSOFF<br>UEL</tt></td>
<td><b>pjloff</b>(<i>fd</i>)<br>
<b>pjluel</b>(<i>fd</i>)<br>
This is important to avoid USTATUS messages<br>
now that we are no longer interested.</td></tr>
</tbody>
</table>
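<p>As an illustration, the PJL framing from the table can be assembled
as byte strings. This Python sketch is not the pracc code: the function
names are hypothetical, and it assumes that UEL is the escape sequence
<tt>ESC%-12345X</tt> and that each PJL line ends in CR LF.</p>

```python
UEL = b"\x1b%-12345X"  # Universal Exit Language escape sequence


def pjl(statement=""):
    """Format one PJL line; an empty statement yields the bare "@PJL" line."""
    text = "@PJL " + statement if statement else "@PJL"
    return text.encode("ascii") + b"\r\n"


def job_prologue(cookie, jobid, language=None, query_pagecount=True):
    """Everything that is sent before the print job data."""
    parts = [UEL + pjl(), pjl(f"ECHO {cookie}")]
    if query_pagecount:                      # the optional INFO PAGECOUNT row
        parts.append(pjl("INFO PAGECOUNT"))
    parts.append(pjl("USTATUS JOB = ON"))
    parts.append(pjl(f'JOB NAME = "{jobid}"'))
    if language is None:
        parts.append(UEL)                    # leave PJL without naming a language
    else:
        parts.append(pjl(f"ENTER LANGUAGE = {language}"))
    return b"".join(parts)


def job_epilogue(jobid):
    """Sent right after the print job data."""
    return UEL + pjl() + pjl(f'EOJ NAME = "{jobid}"')


def status_off():
    """Sent once the JOB END message has arrived."""
    return UEL + pjl("USTATUSOFF") + UEL
```

<p>The caller would write <tt>job_prologue(...)</tt>, the job data, and
<tt>job_epilogue(...)</tt> to the printer connection while reading
responses in a select loop, and send <tt>status_off()</tt> at the very end.</p>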
<p>The code that reacts to the messages received from the printer
must be careful not to interpret messages from previous print jobs.
That is the purpose of the PJL ECHO statement in the print job. This is
most easily handled in four sequential phases:</p>
<blockquote>INIT >> SYNCED >> INJOB >> DONE</blockquote>
<p>The transition from INIT to SYNCED is triggered by the arrival
of our @PJL ECHO <i>cookie</i> message; the transition from SYNCED
to INJOB is triggered by the @PJL USTATUS JOB START message; the
transition from INJOB to DONE by the @PJL USTATUS JOB END message.
Unexpected ECHO or USTATUS JOB messages should be ignored.</p>
<p>@PJL USTATUS PAGE messages should be processed in the phase
INJOB by updating the <i>pages</i> variable and issuing a
"PAGE: <i>n</i> 1" log line for the CUPS scheduler
to update the job-media-sheets-completed attribute.</p>
<p>A @PJL INFO PAGECOUNT message in phase SYNCED should be used
to set the <i>pagecount</i> variable. In other phases, such
messages should be ignored.</p>
<p>A @PJL USTATUS JOB END message in phase INJOB should be used
to set the <i>pages</i> variable that is to be used for accounting.
After receipt of the JOB END message, be sure to issue a USTATUSOFF
statement to turn "unsolicited status" messages off.</p>
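<p>The four phases and the rules above can be summarized as a small
state machine. The following Python sketch is an illustration only:
it assumes that a lower-level parser has already turned each printer
response into a <tt>(kind, value)</tt> pair, and the kind names used
here are hypothetical, not PJL keywords or pracc identifiers.</p>

```python
class PjlAccounting:
    """Tracks INIT, SYNCED, INJOB, DONE for one print job."""

    def __init__(self, cookie):
        self.cookie = cookie
        self.phase = "INIT"
        self.pagecount = None   # value of INFO PAGECOUNT, if requested
        self.pages = 0          # pages printed for this job

    def handle(self, kind, value=None):
        """Process one parsed message; return a CUPS log line or None."""
        if kind == "ECHO":
            # Only our own cookie moves us out of INIT.
            if self.phase == "INIT" and value == self.cookie:
                self.phase = "SYNCED"
        elif kind == "PAGECOUNT":
            if self.phase == "SYNCED":
                self.pagecount = value
        elif kind == "JOB_START":
            if self.phase == "SYNCED":
                self.phase = "INJOB"
        elif kind == "PAGE":
            if self.phase == "INJOB":
                self.pages = value
                return f"PAGE: {value} 1"   # to be logged for the scheduler
        elif kind == "JOB_END":
            if self.phase == "INJOB":
                self.pages = value          # figure used for accounting
                self.phase = "DONE"
        return None                         # anything else is ignored
```

<p>Once <tt>phase</tt> reaches DONE, the caller would send the
USTATUSOFF statement and use <tt>pages</tt> for accounting.</p>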
<a name="snmp"></a>
<h3>Page Counting with SNMP</h3>
<p>The printer's pagecount register is part of the standard
printer MIB and therefore can be queried using SNMP. An open
problem is, as with the PostScript method, determining when
the job has finished printing. I have not (yet) implemented
this method.</p>
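<p>For illustration only, here is an untested Python sketch of how the
counter might be read with the Net-SNMP <tt>snmpget</tt> command.
The assumptions are considerable: that <tt>snmpget</tt> is installed,
that the printer answers SNMP v1 requests for the given community, and
that the counter is <tt>prtMarkerLifeCount</tt> at index 1.1 of the
Printer MIB, which may differ between devices. The function names are
hypothetical.</p>

```python
import subprocess

# prtMarkerLifeCount for device 1, marker 1 (the index is an assumption)
PRT_MARKER_LIFE_COUNT = "1.3.6.1.2.1.43.10.2.1.4.1.1"


def snmpget_command(host, community="public"):
    """Build the snmpget command line; -Oqv prints the bare value only."""
    return ["snmpget", "-v", "1", "-c", community, "-Oqv",
            host, PRT_MARKER_LIFE_COUNT]


def parse_count(output):
    """Convert the bare value printed by snmpget into an int."""
    return int(output.strip())


def read_pagecount(host, community="public"):
    """Run snmpget and return the page counter (requires Net-SNMP)."""
    result = subprocess.run(snmpget_command(host, community),
                            capture_output=True, text=True, check=True)
    return parse_count(result.stdout)
```

<p>Such a function would still have to be combined with a polling
heuristic, since SNMP does not tell us when the job has finished.</p>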
<div><hr>
Copyright (c) 2003-2007 by Urs-Jakob Rüetschi</div>
</body>
</html>