Q. Which measures reflect the effectiveness of a biometric authentication system?
False Acceptance Rate (FAR)
The FAR is the frequency with which a non-authorized person is accepted as authorized. Because a false acceptance can often lead to damages, FAR is generally a security-relevant measure. FAR is a non-stationary statistical quantity which not only shows a personal correlation but can even be determined for each individual biometric characteristic (called personal FAR).
False Rejection Rate (FRR)
The FRR is the frequency with which an authorized person is denied access. FRR is generally thought of as a comfort criterion, because a false rejection is above all annoying. FRR is a non-stationary statistical quantity which not only shows a strong personal correlation but can even be determined for each individual biometric characteristic (called personal FRR).
Failure To Enrol rate (FTE, also FER)
The FER is the proportion of people who fail to be enrolled successfully. FER is a non-stationary statistical quantity which not only shows a strong personal correlation but can even be determined for each individual biometric characteristic (called personal FER).
Those who are enrolled but are mistakenly rejected after many verification/identification attempts count towards the Failure To Acquire (FTA) rate. FTA can arise from features that are temporarily not measurable (e.g., a bandage or insufficient sensor image quality). The FTA is usually considered part of the FRR and need not be calculated separately; see also FNMR and FMR.
False Identification Rate (FIR)
The False Identification Rate is the probability in an identification that the biometric features are falsely assigned to a reference. The exact definition depends on the assignment strategy; namely, after feature comparison, often more than one reference will exceed the decision threshold.
Further Implicit Measures
False Match Rate (FMR). The FMR is the rate at which non-authorized people are falsely recognized during the feature comparison. In contrast to the FAR, attempts previously rejected due to poor (image) quality (Failure to Acquire, FTA) are not counted. Whether a falsely recognized biometric characteristic leads to an increase in FAR or FRR depends upon the application. (There are applications which define a successful recognition as a rejection, for example when the double issue of identification cards to a person with a false identity is prevented by comparing the current features with the centrally stored reference features of all cards issued so far.)
False Non-Match Rate (FNMR). The FNMR is the rate at which authorized people are falsely not recognized during feature comparison. In contrast to the FRR, attempts previously rejected due to poor (image) quality (Failure to Acquire, FTA) are not counted. Whether a falsely unrecognized biometric characteristic leads to an increase in FAR or FRR depends upon the application.
Q. How is the Failure-to-Enrol Rate (FER/FTE) defined in detail?
Due to the statistical nature of the failure-to-enrol rate, a large number of enrolment attempts have to be undertaken to get statistically reliable results. The enrolment can be successful or unsuccessful. The probability for lack of success (FER(n)) for a certain person is measured:
$$\mathrm{FER}(n) = \frac{\text{Number of unsuccessful enrolment attempts for a person (or feature) } n}{\text{Number of all enrolment attempts for a person (or feature) } n}$$
These values become more reliable with more independent attempts per person/feature. The overall FER for N participants is defined as the average of FER(n):
$$\mathrm{FER} = \frac{1}{N} \sum_{n=1}^{N} \mathrm{FER}(n)$$
The values are more accurate with higher numbers of participants (N). Alternatively, the median value may be calculated.
Finally, the result of an enrolment attempt has to be defined exactly:
An enrolment attempt is successful if the user interface of the application provides a "successful"- or "finished" message.
An enrolment attempt is unsuccessful if the user interface of the application provides an "unsuccessful" message.
In cases where no defined completion is available, a fixed enrolment time interval has to be given to ensure comparability. If the time interval has expired the enrolment attempt is counted unsuccessful.
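As a minimal sketch of the definition above, the personal rates FER(n) can be averaged as follows. The function names and the sample counts are illustrative assumptions, not taken from any product. The example also shows that averaging personal rates gives a different value from pooling all attempts when the number of attempts per person differs.

```python
from statistics import mean, median

def personal_rate(unsuccessful, attempts):
    """FER(n): fraction of unsuccessful attempts for one person (or feature)."""
    if attempts <= 0:
        raise ValueError("at least one attempt is required")
    return unsuccessful / attempts

def overall_rates(counts):
    """Mean and median of the personal rates.

    counts: list of (unsuccessful, attempts) tuples, one per person.
    """
    rates = [personal_rate(u, a) for u, a in counts]
    return mean(rates), median(rates)

def pooled_rate(counts):
    """All attempts thrown together; people with more attempts weigh more."""
    return sum(u for u, _ in counts) / sum(a for _, a in counts)

# Hypothetical enrolment log: (unsuccessful attempts, all attempts) per person.
enrolment_counts = [(0, 10), (1, 10), (5, 10), (0, 20)]

fer_mean, fer_median = overall_rates(enrolment_counts)
fer_pooled = pooled_rate(enrolment_counts)
print(fer_mean, fer_median, fer_pooled)
```

With these made-up counts the mean of the personal rates, the median, and the pooled rate all differ, which is why the averaging method should be stated together with the result.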
Q. What needs to be considered in the definition of FRR?
Even though the false rejection rate, FRR, is intuitively easy to understand, there can be many problems when trying to fix an unequivocal or universal definition. The following must be taken into account:
The FRR is a statistical value whose measurement accuracy depends on the number of measurements. Now the FRR is not only dependent on the biometric system, but on the users as well. There is thus a personal FRR. If one wants to deal with large numbers of people, it is important that the end result is not negatively affected by an individual. Such could occur when the number of attempts per person differs. This problem can be avoided, if one first identifies each personal FRR curve and calculates the mean from those (or uses the median, but this provides different values!).
The exact meaning of rejection must be clarified. Here, for example, the total number of recognition attempts allowed before a recognition is finally judged to have failed plays a role. There are systems which can process a verification continuously in real time; in that case a verification time slot is offered.
Many biometric systems reject a verification due to poor picture quality (e.g., dirty or worn down fingers in a fingerprint verification, noisy surroundings in a voice recognition, poor lighting in a facial recognition, or sensor problems). When such problems are not due to a faulty operation, rejections due to picture quality problems are still false rejections. The user is indifferent to the reason for false rejections.
Even the personal FRR can vary with time. It decreases, for example, when one uses the system frequently and the system learns to avoid false rejections. In such cases, it is only reasonable for comparisons to determine FRR during learning phases.
In the case that a liveness/fake recognition is also used, this needs to be considered when determining the FRR.
Q. How is FRR defined in detail?
Due to the statistical nature of the false rejection rate, a large number of verification attempts have to be undertaken to get statistically reliable results. The verification can be successful or unsuccessful. In determining the FRR, only fingerprints from successfully enrolled users are considered. The probability for lack of success (FRR(n)) for a certain person is measured:
$$\mathrm{FRR}(n) = \frac{\text{Number of rejected verification attempts for a qualified person (or feature) } n}{\text{Number of all verification attempts for a qualified person (or feature) } n}$$
These values become more reliable with more independent attempts per person/feature. The overall FRR for N participants is defined as the average of FRR(n):
$$\mathrm{FRR} = \frac{1}{N} \sum_{n=1}^{N} \mathrm{FRR}(n)$$
The values are more accurate with higher numbers of participants (N). Alternatively, the median value may be calculated.
Important: the determined FRR includes both poor picture quality and other rejection reasons such as finger position, rotation, etc. in the reasons for rejection. In many systems, however, rejections due to bad quality are generally independent of the threshold. The FRR after quality filtering is similarly defined:
$$\frac{\text{Number of rejected "qualified" attempts}}{\text{Total number of "qualified" attempts}}$$
An FRR defined in this way generally yields better data sheet values, but these lower numbers do not reflect what a user actually experiences.
Finally, the result of a verification attempt has to be defined exactly:
A verification attempt is successful if the user interface of the application provides a "successful" message or if the desired access is granted.
A verification attempt counts as rejected if the user interface of the application provides an "unsuccessful" message.
In cases of no reaction, a verification time interval has to be given to ensure comparability. If the time interval has expired the verification attempt is counted unsuccessful.
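The difference between the FRR as the user experiences it and the FRR after quality filtering can be illustrated with a short sketch. The data layout (one `(accepted, quality_ok)` pair per attempt) and all names are assumptions made for this example only.

```python
def personal_frr(attempts, quality_filtered=False):
    """FRR(n) for one enrolled person.

    attempts: list of (accepted, quality_ok) pairs, one per verification attempt.
    quality_filtered: if True, only "qualified" attempts (quality_ok) are counted.
    """
    if quality_filtered:
        attempts = [(acc, ok) for acc, ok in attempts if ok]
    if not attempts:
        raise ValueError("no attempts left to evaluate")
    rejected = sum(1 for acc, _ in attempts if not acc)
    return rejected / len(attempts)

def overall_frr(people, quality_filtered=False):
    """Average of the personal FRR(n) over all N participants."""
    rates = [personal_frr(a, quality_filtered) for a in people]
    return sum(rates) / len(rates)

# Hypothetical test log for two enrolled persons.
person_1 = [(True, True)] * 7 + [(False, True)] * 1 + [(False, False)] * 2
person_2 = [(True, True)] * 9 + [(False, True)] * 1
people = [person_1, person_2]

frr_all = overall_frr(people)                                # what the user sees
frr_qualified = overall_frr(people, quality_filtered=True)   # data sheet style
print(frr_all, frr_qualified)
```

In this made-up log the quality-filtered figure is noticeably lower than the figure that includes quality rejections, although nothing has changed for the user.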
Q. What needs to be considered in the definition of FAR?
Similar to the FRR, the false acceptance rate can be defined differently.
The FAR is a statistical value, whose measurement accuracy depends on the number of measurements. The FAR depends not only on the biometric system, but on the user as well. There is also a personal FAR. If one wants to deal with large numbers of people, it is important that one individual does not negatively affect the end result. Such could occur when the number of attempts per person differs. This problem can be avoided, if one first identifies each personal FAR curve and calculates the mean from those (or uses the median, but this provides different values!). In determining FAR, it is generally easier to limit the number of recognition attempts to 1 per person. Further attempts per person will smooth out the ROC graph, but add little to the statistical significance.
If the biometric system has picture quality management which rejects a false user due to poor picture quality before verification even takes place, this is of course a correct rejection and leads to an improved FAR.
Strong behavioral biometric features (e.g., voice or signature) are often purposefully forged or copied. In investigating FAR, it needs to be determined whether tests simply recognize foreign features or also attempted forgeries. This difference can be serious.
Q. How is FAR defined in detail?
Due to the statistical nature of the false acceptance rate, a large number of fraud attempts have to be undertaken to get statistically reliable results. The fraud trial can be successful or unsuccessful. The probability for success (FAR(n)) against a certain enrolled person n is measured:
$$\mathrm{FAR}(n) = \frac{\text{Number of successful independent fraud attempts against a person (or characteristic) } n}{\text{Number of all independent fraud attempts against a person (or characteristic) } n}$$
These values are more reliable with more independent attempts per person/characteristic. In this context, independence means that all fraud attempts have to be performed with different persons or characteristics! The overall FAR for N participants is defined as the average of all FAR(n):
$$\mathrm{FAR} = \frac{1}{N} \sum_{n=1}^{N} \mathrm{FAR}(n)$$
The values are more accurate with higher numbers of different participants/characteristics (N). Alternatively, the median value may be calculated.
Whether a correct rejection is due to poor picture quality or really to a person's unauthorized status is irrelevant here, just as it is in practice.
The crucial number for determining statistical significance is the number of independent attempts. Obviously, two attempts in which one person is the reference and another places the request, and then the roles are swapped, are not independent of each other. Likewise, multiple attempts from one unauthorized user are considered dependent and therefore contribute less to statistical significance.
Finally, the following items have to be settled, or defined, respectively:
What is a fraud attempt?
How is the result of a fraud attempt defined exactly?
Usually, during FAR determination, a fraud attempt is an attack using the characteristics of non-authorized persons. This, however, suggests a level of security which may not be present, since there are many other promising ways to attack a system.
A fraud attempt is successful if the user interface of the application provides a "successful" message or if the desired access is granted.
A fraud attempt counts as rejected if the user interface of the application provides an "unsuccessful" message.
In cases where no "unsuccessful" message is available, a verification time interval has to be given to ensure comparability. If the verification time interval has expired the fraud attempt is counted unsuccessful.
Q. How is the probability distribution function measured for a biometric system's authorized and unauthorized users?
In order to investigate the performance of a biometric verification system, one looks at how the system reacts to a large number of inquiries with biometric features from authorized as well as unauthorized users. Due to natural fluctuations and measurement imperfections, the results of such an investigation are never absolutely certain but only predictable to a certain extent. To determine the error rates "false acceptance" and "false rejection", one does not use the yes/no decisions "authorized/unauthorized" but the underlying degree of similarity between an inquiry and the saved reference feature. In a series of measurements, similarity ratings ("score values") are collected for authorized and unauthorized users. Then the frequency of occurrence is counted for every similarity rating. After normalization by the total number of inquiries, the two resulting histograms approximate the probability distribution functions. They give the measured estimate of the probability that a certain similarity rating n occurs for authorized users (pB(n)) and for unauthorized users (pN(n)):
$$p_B(n) \sim \frac{\text{Number of measurements with similarity rating } n \text{ for authorized users}}{\text{Total number of measurements for authorized users}}$$
$$p_N(n) \sim \frac{\text{Number of measurements with similarity rating } n \text{ for unauthorized users}}{\text{Total number of measurements for unauthorized users}}$$
The higher the total number of measurements, the more accurate the estimation. (See "Statistical Significance". A mathematical determination of probabilities as the ratio of the relevant possibilities to the total number of possibilities fails because, unlike with dice, there are simply too many different possibilities to enumerate.)
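The histogram estimate described above can be sketched in a few lines. The function name, the scale limit K = 4, and the score lists are assumptions chosen only to keep the example small.

```python
from collections import Counter

def score_distribution(scores, K):
    """Normalized histogram p(0), ..., p(K) of integer similarity ratings."""
    if not scores:
        raise ValueError("at least one measurement is required")
    if any(s < 0 or s > K for s in scores):
        raise ValueError("similarity ratings must lie between 0 and K")
    counts = Counter(scores)
    # Counter returns 0 for ratings that never occurred.
    return [counts[n] / len(scores) for n in range(K + 1)]

# Hypothetical measurement series on a scale from 0 to K = 4.
K = 4
authorized_scores = [2, 3, 3, 3, 1, 3, 2, 3]
unauthorized_scores = [0, 0, 1, 0, 1, 2, 0, 1]

p_b = score_distribution(authorized_scores, K)    # estimate of pB(n)
p_n = score_distribution(unauthorized_scores, K)  # estimate of pN(n)
print(p_b)
print(p_n)
```

Each list sums to 1, and in this toy data the two histograms overlap at the ratings 1 and 2, which is the situation a decision threshold has to deal with.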
In an ideal case (unfortunately unachievable), the two distribution curves do not overlap. That means inquiries from unauthorized users receive only low similarity ratings, whereas all the high similarity ratings belong to authorized users. In such a case it is easy to define a decision threshold that clearly differentiates between authorized and unauthorized users. In practice, however, there is always an overlap when the number of users is high enough. A typical diagram:
[Figure: typical probability distributions of similarity ratings for authorized and unauthorized users, showing the overlap]
Q. How do the FAR/FRR paired graphs affect a biometric system?
The error graphs of FAR and FRR are respectively defined as the probability that an unauthorized user is accepted as authorized, and that an authorized user is rejected as unauthorized. The curves are dependent upon an adjustable decision threshold for the similarity of a scanned biometric characteristic to a saved reference. The following derivations apply under the assumption that a similarity rating value can be any whole number between 0 and K, and that, for simplicity's sake, the probability of value K occurring is 0. It also makes sense in practical applications, when we first consider the FMR and the FNMR and later extract the threshold-independent rejections due to insufficient image quality from the FAR and FRR. Furthermore, we assume that for acceptance the coincidence of two features and for rejection the non-coincidence is required.
If a general probability distribution function p is given for discrete similarity values n, the probability PM(th) that the scanned biometric characteristic with similarity rating n falls below threshold th ("misses") is:
$$P_M(0) := 0$$
$$P_M(th) = \sum_{n=0}^{th-1} p(n), \qquad th = 1, 2, 3, \ldots, K$$
The sum of correct matches and mismatches must equal the number of total events. For that reason, the probability PH(th) that the similarity rating of the scanned trait reaches or exceeds threshold th ("hits") will be:
$$P_H(th) = 1 - P_M(th) = \sum_{n=th}^{K} p(n), \qquad th = 0, 1, 2, \ldots, K$$
The False Match Rate FMR(th) is an estimate of the probability that the similarity of two non-identical features reaches or exceeds a certain threshold value th. Therefore:
$$\mathrm{FMR}(th) \sim P_H(th) = 1 - \sum_{n=0}^{th-1} p_N(n), \qquad th = 1, 2, 3, \ldots, K$$
The analogous relation applies to the False Non-Match Rate FNMR(th):
$$\mathrm{FNMR}(th) \sim P_M(th) = \sum_{n=0}^{th-1} p_B(n), \qquad th = 1, 2, 3, \ldots, K$$
where pN is the probability frequency function for non-authorized users and pB is that for authorized users. The approximation (~) indicates that only the expected values of the measured failure rates FMR and FNMR are identical with the probabilities PH and PM, respectively. The limit values are:
$$\mathrm{FMR}(0) = 1, \qquad \mathrm{FMR}(K) = 0$$
$$\mathrm{FNMR}(0) = 0, \qquad \mathrm{FNMR}(K) = 1$$
To calculate FAR and FRR, the threshold-independent quality rejection rate QRR (equals FTA, depending on definition) has to be taken into consideration. Provided that a false acceptance is assigned to a false match, we obtain:
$$\mathrm{FAR}(th) = (1 - \mathrm{QRR})\,\mathrm{FMR}(th)$$
$$\mathrm{FRR}(th) = \mathrm{QRR} + (1 - \mathrm{QRR})\,\mathrm{FNMR}(th)$$
For the border values we then get:
$$\mathrm{FAR}(0) = 1 - \mathrm{QRR}, \qquad \mathrm{FAR}(K) = 0$$
$$\mathrm{FRR}(0) = \mathrm{QRR}, \qquad \mathrm{FRR}(K) = 1$$
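The relations above can be sketched as successive summation over the two measured distributions. This is a minimal illustration under the stated assumptions (integer ratings 0 to K, probability of rating K equal to 0); the function names, the toy distributions, and the value QRR = 0.2 are made up for the example.

```python
def fmr_curve(p_n):
    """FMR(th) for th = 0..K: share of unauthorized ratings at or above th."""
    return [sum(p_n[th:]) for th in range(len(p_n))]

def fnmr_curve(p_b):
    """FNMR(th) for th = 0..K: share of authorized ratings below th."""
    return [sum(p_b[:th]) for th in range(len(p_b))]

def far_frr_curves(p_b, p_n, qrr=0.0):
    """FAR(th) and FRR(th) including the threshold-independent quality rejection rate."""
    far = [(1 - qrr) * v for v in fmr_curve(p_n)]
    frr = [qrr + (1 - qrr) * v for v in fnmr_curve(p_b)]
    return far, frr

# Hypothetical distributions on a scale 0..K with K = 4 (index = similarity rating).
p_b = [0.0, 0.125, 0.25, 0.625, 0.0]   # authorized users
p_n = [0.5, 0.375, 0.125, 0.0, 0.0]    # unauthorized users

fmr = fmr_curve(p_n)
fnmr = fnmr_curve(p_b)
far, frr = far_frr_curves(p_b, p_n, qrr=0.2)

for th, (a, r) in enumerate(zip(far, frr)):
    print(f"th={th}: FAR={a:.3f} FRR={r:.3f}")
```

The first and last entries of each list reproduce the border values: FAR(0) = 1 - QRR, FAR(K) = 0, FRR(0) = QRR and FRR(K) = 1.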
Setting a similarity rating th as the threshold to differentiate between authorized and non-authorized users yields the experimental estimate of the false acceptance rate FAR(th) as the number of similarity ratings of non-authorized users that reach or exceed this threshold, relative to the total number of their trials. Conversely, the false rejection rate FRR(th) is the number of authorized users' similarity ratings which fall below this same threshold, relative to the total number of their inquiries. Through integration (in practice, successive summation) of the probability distribution curves, the FAR and FRR graphs are determined as functions of the adjustable threshold th. The following diagrams show typical results in linear and logarithmic scale:
[Figure: typical FAR and FRR curves as functions of the threshold th, linear scale]
[Figure: typical FAR and FRR curves as functions of the threshold th, logarithmic scale]
Q. How does one determine the Receiver Operating Characteristic (ROC) of a biometric system?
The FAR/FRR curve pair is excellently suited to set an optimal threshold for the biometric system. Further predictors of a system's performance, however, are limited. This is partially due to the interpretation of the threshold and similarity measures. The definition of the similarity measures is a question of implementation. Almost arbitrary scaling and transformations are possible, which affect the appearance of FAR/FRR curves but not the FAR-FRR values at a certain threshold. A popular example is the use of a "distance measure" between the biometric reference and the scanned biometric features. The greater the similarity, the smaller the distance. The result is a mirror image of the FAR/FRR curves. A favorite trick is to stretch the scale of FAR/FRR curves near the EER (Equal Error Rate: FAR(th) = FRR(th)), (i.e., using more threshold values) thus making the system appear less sensitive to threshold changes.
In order to compare different systems effectively, a description independent of threshold scaling is required. One such description, borrowed from radar technology, is the Receiver Operating Characteristic (ROC), which plots FRR values directly against FAR values, thereby eliminating the threshold parameter. The ROC, like the FRR, can only take on values between 0 and 1 and is limited to values between 0 and 1 on the x axis (FAR). It has the following characteristics:
The ideal ROC only has values that lie either on the x axis (FAR) or the y axis (FRR); i.e., when the FRR is not 0, the FAR is 0, and vice versa.
The highest point (linear scale under the definitions used here) is for all systems given by FAR=0 and FRR=1.
The ROC cannot increase.
As the ROC curves for good systems lie very near the coordinate axes, it is reasonable to use a logarithmic scale for one or both axes:
[Figure: ROC curves plotted with logarithmic axes]
Remark: Instead of "ROC", sometimes the term "DET" (Detection Error Tradeoff) is used. In those cases, the term "ROC" is reserved for the complementary plot of 1 - FRR against FAR.
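A small sketch can make the threshold independence of the ROC concrete. It assumes FAR and FRR are available as lists indexed by threshold; the helper name and the numbers are illustrative only. Stretching the threshold scale or mirroring it (as with a distance measure) changes the FAR/FRR curves but leaves the set of ROC points untouched.

```python
def roc_points(far, frr):
    """Distinct (FAR, FRR) pairs ordered by increasing FAR; the threshold is dropped.

    Ties in FAR are ordered by decreasing FRR so that the curve never increases.
    """
    return sorted(set(zip(far, frr)), key=lambda p: (p[0], -p[1]))

# Hypothetical FAR/FRR values for thresholds th = 0..4.
far = [1.0, 0.5, 0.125, 0.0, 0.0]
frr = [0.0, 0.0, 0.125, 0.375, 1.0]
roc = roc_points(far, frr)

# Same system with a stretched threshold scale (every threshold used twice).
far_stretched = [v for v in far for _ in range(2)]
frr_stretched = [v for v in frr for _ in range(2)]
roc_stretched = roc_points(far_stretched, frr_stretched)

# Same system expressed with a distance measure: both curves are mirrored.
roc_mirrored = roc_points(far[::-1], frr[::-1])

print(roc)
print(roc == roc_stretched == roc_mirrored)
```

Because only the pairs of values matter, two vendors using entirely different score scales can still be compared on the same ROC diagram.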
Q. How does a transition from verification to identification affect the FAR?
In a verification a biometric feature is compared with only one reference, whereas in an identification, it is compared with N (N>1) different references. This transition to an identification results in higher FAR, and in an ideal case is as follows:
$$\mathrm{FAR}_N = 1 - (1 - \mathrm{FAR}_1)^N$$
where FARN is the false acceptance rate for N different stored references. The formula is restricted to the "access control" case where the correct assignment to an identity is not essential. For N·FAR1 significantly smaller than 1, this is approximately:
$$\mathrm{FAR}_N \sim N \cdot \mathrm{FAR}_1$$
Example: A database has 100 000 different references. In an identification, the FAR rises from $10^{-7}$ to about $10^{-2}$!
If in an application the correct assignment of ID data is essential (e.g., for bank transactions), other methods have to be used, as explained under Determination of FIR.
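The database example can be checked numerically with a short sketch. The function names are assumptions; the use of `log1p` and `expm1` is only a numerical convenience for very small rates and does not change the formula.

```python
from math import expm1, log1p

def far_identification(far1, n):
    """FAR_N = 1 - (1 - FAR_1)^N for the ideal access control case."""
    if not 0 <= far1 < 1 or n < 1:
        raise ValueError("need 0 <= far1 < 1 and n >= 1")
    # 1 - (1 - far1)**n, evaluated in a way that stays accurate for tiny far1.
    return -expm1(n * log1p(-far1))

def far_identification_approx(far1, n):
    """Approximation N * FAR_1, valid while the result is far below 1."""
    return n * far1

far_exact = far_identification(1e-7, 100_000)
far_approx = far_identification_approx(1e-7, 100_000)
print(far_exact, far_approx)
```

For 100 000 references the exact value comes out just below one percent, so the linear approximation is already a good guide at this scale.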
Q. How does a transition from verification to identification affect the FRR?
During identification, the biometric features to be recognized are compared to all references. Obviously, in contrast to a verification, more than one similarity value (score) is generated. This fact complicates the decision whether or not a biometric characteristic is to be accepted. In particular, there are multiple ways to decide if, e.g., several scores exceed a threshold. As a result, each decision procedure needs its own definition of a false rejection. Two examples are given:
One must differentiate between applications which allow access to personal data after a successful identification (e.g., access to a personal bank account), and applications which grant general access not dependent on one's identity (e.g., entrance to a room without a protocol of an identified person's presence). In the first case an assignment of a biometric characteristic to a false identity may happen. This is called a false identification, characterized by the False Identification Rate FIR. Furthermore, it is conceivable that more than one reference template will generate a score above the threshold. This case is treated in Determination of FIR, showing that different decision strategies may yield different results.
In the second case, the false rejection rate FRR decreases as the number of different references increases! How can that be? Very simply: a larger number of references increases the probability that a legitimate user is "identified" not only by his or her own features but also by those of others, which would normally be considered a false acceptance. The user, however, does not notice the system's mistake. Mathematically, under ideal conditions this gives:
$$\mathrm{FRR}_N = \mathrm{FRR}_1 (1 - \mathrm{FAR}_1)^{N-1}$$
Q. How is the False Identification Rate (FIR) calculated?
During an identification, the recognition biometric features are compared to many references and possibly, the similarity value will exceed the threshold for more than one reference. This is non-critical if only granting access, but can be very problematic if the correct assignment of personal data to the biometric characteristic is required (Example: access to a bank account via ATM).
The probability for the identification of further (by definition false) candidates (independent of the correct reference) can be calculated from the FAR since these candidates would represent false acceptances in the case of verification. Its value is given by:
$$1 - (1 - \mathrm{FAR}_1)^{N-1} \sim (N - 1)\,\mathrm{FAR}_1$$
whereby FAR1 is the False Acceptance Rate for a system with one reference. N represents the number of references. The approximation (right side) applies in the case that the resulting value lies considerably under 1.
The False Identification Rate can only be calculated once a rule for selecting one of the candidates has been fixed. One rule often found in practical applications is, for example, to choose the candidate with the highest similarity value (presuming that there is only one). Unfortunately, under this rule the FIR can only be determined when the probability density functions are available for false acceptance as well as false rejection.
Easier to calculate is the rule that multiple candidates are completely rejected, which raises the FRR and lowers FAR. The following definitions apply here:
FAR: probability that a non-authorized person is identified
FRR: probability that an authorized person is not identified
FIR: probability that an authorized person is identified, but is assigned a false ID
These definitions result in the following formulas under ideal conditions (statistical independence, same error rates for all people, ...), where the index N is again the number of references:
$$\mathrm{FAR}_N = N\,\mathrm{FAR}_1 (1 - \mathrm{FAR}_1)^{N-1}$$
$$\mathrm{FRR}_N = 1 - (1 - \mathrm{FRR}_1 - \mathrm{FAR}_1 + N\,\mathrm{FRR}_1\,\mathrm{FAR}_1)(1 - \mathrm{FAR}_1)^{N-2}$$
$$\mathrm{FIR}_N = (N - 1)\,\mathrm{FRR}_1\,\mathrm{FAR}_1 (1 - \mathrm{FAR}_1)^{N-2}$$
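The three formulas for the "reject multiple candidates" rule can be evaluated with a short sketch. The function names and the sample rates are assumptions. As a plausibility check, the sketch also computes the probability of a correct identification (own reference matches and no other reference does), a quantity that is not named in the text but follows from the same independence assumptions; together with FRR_N and FIR_N it should add up to 1.

```python
def identification_rates(far1, frr1, n):
    """FAR_N, FRR_N and FIR_N when results with several candidates are rejected."""
    if n < 2:
        raise ValueError("the formulas are used here for n >= 2 references")
    far_n = n * far1 * (1 - far1) ** (n - 1)
    frr_n = 1 - (1 - frr1 - far1 + n * frr1 * far1) * (1 - far1) ** (n - 2)
    fir_n = (n - 1) * frr1 * far1 * (1 - far1) ** (n - 2)
    return far_n, frr_n, fir_n

def correct_identification(far1, frr1, n):
    """Assumed helper: own reference matches and none of the other n - 1 does."""
    return (1 - frr1) * (1 - far1) ** (n - 1)

# Hypothetical single-reference rates and database size.
far1, frr1, n = 1e-4, 0.02, 1000
far_n, frr_n, fir_n = identification_rates(far1, frr1, n)
correct = correct_identification(far1, frr1, n)

print(far_n, frr_n, fir_n)
print(frr_n + fir_n + correct)  # the three outcomes for an authorized person
```

With these made-up rates the FRR of the identification is clearly higher than the single-reference FRR, which is the price of rejecting every result with more than one candidate.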
Q. When are FAR and FRR values statistically significant?
A value is considered statistically significant when it is likely that it falls within a given error interval and the probability of falling outside this interval by chance is relatively low. Statistical significance depends upon the number of trials or sample size. Because biometric values are difficult to model, statistical significance is hard to estimate. As a rule of thumb ("Doddington's rule"), one must conduct enough tests that a minimum of 30 erroneous cases occur [Porter 1977]. Example: An FAR of $10^{-6}$ can be considered reliable when 30 errors occur in 30 million trials. One error in a million trials also corresponds to an FAR of $10^{-6}$, but is statistically far less significant. One can see that biometric tests are very expensive if performance needs to be very high. The situation would be easier if further information could be considered along with the yes/no decisions (accept/reject), for example the proximity of a decision to the acceptance threshold.
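The rule of thumb translates directly into a minimum test size. The helper below is a sketch with an assumed name; it takes the expected error rate as a string so that the arithmetic is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction
from math import ceil

def min_trials(error_rate, min_errors=30):
    """Smallest number of trials in which min_errors errors are expected.

    error_rate: expected error rate as a string such as "1e-6" or "0.03",
    converted exactly with Fraction.
    """
    rate = Fraction(error_rate)
    if not 0 < rate <= 1:
        raise ValueError("error rate must be in (0, 1]")
    return ceil(min_errors / rate)

print(min_trials("1e-6"))   # test size for the FAR example in the text
print(min_trials("0.03"))   # a much larger error rate needs far fewer trials
```

For an FAR of one in a million this reproduces the 30 million trials mentioned in the example.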
Q. What is essential when comparing the ROC performance of biometric systems?
The accuracy performance of a verification system can be determined by exactly three statistical quantities: FAR, FER, and FRR. Since these three quantities influence each other when parameters (e.g., quality acceptance thresholds for enrolment and authentication) are changed, comparing one quantity between two systems only makes sense when the other two quantities are equal for both systems. For example, let the FARs of different systems be compared. Then the corresponding FRRs must be equal, and the FERs must be equal, too. In a ROC diagram, this condition can easily be fulfilled for all FRRs for which the curve has been measured, provided that the FERs of all curves are constant and the same. However, this condition is often violated because the FERs actually differ!
A solution to this problem comes from the procedure used, e.g., in the Fingerprint Verification Competition FVC2002, where different algorithms for fingerprint recognition have been tested. The idea is to consider a failure-to-enrol case as a virtual "FTE user" with the properties:
If the virtual FTE user tries a (virtual!) authentication, the result is always a rejection, thus increasing the FRR.
If an impostor tries an authentication attempt against a virtual FTE user, a rejection is always assumed, thus decreasing the FAR.
This way, the FER is eliminated and the ROC curves as well as the FAR/FRR values are forced to become comparable. Mathematically, we implement this method by introducing a Generalized FRR (GFRR) and a Generalized FAR (GFAR). (It will be a matter of standardization to fix these terms. Here they are used until standardization is finalized.) The calculation of GFRR and GFAR is quite simple, if we assume that each authentication trial is preceded by its own enrolment trial. This should make sense because authentication performance is not independent of enrolment: a good enrolment delivers better FRR values than a worse one. Therefore it seems to be statistically more accurate not to base a whole FRR statistics on a single enrolment!
$$\mathrm{GFAR}(th) = (1 - \mathrm{FER})\,\mathrm{FAR}(th)$$
$$\mathrm{GFRR}(th) = \mathrm{FER} + (1 - \mathrm{FER})\,\mathrm{FRR}(th)$$
Here (th) denotes the dependency on the decision threshold parameter th which is assumed to range between 0 and K (arbitrary), see "How do the FAR/FRR paired graphs affect a biometric system?". These formulas show a strong relationship to those derived for FAR and FRR when including the FTA (Failure-to-Acquire).
Similarly, we get for the border values:
$$\mathrm{GFAR}(0) = (1 - \mathrm{FER})(1 - \mathrm{QRR}), \qquad \mathrm{GFAR}(K) = 0$$
$$\mathrm{GFRR}(0) = \mathrm{FER} + (1 - \mathrm{FER})\,\mathrm{QRR}, \qquad \mathrm{GFRR}(K) = 1$$
Both formulas are symmetric in QRR (= FTA) and FER (= FTE), showing the strong relationship between Failure to Enrol and Failure to Acquire. In some cases these two values are even equal. This happens when the biometric system uses the same quality rejection mechanisms and levels for enrolment and for authentication. In practice, higher quality requirements during enrolment, leading to a higher FTE, might be quite reasonable to prevent enrolment of nonsense features. Furthermore, too low an enrolment quality will decrease usability of the authentication systems in daily use. In many applications it is better to spend more time during enrolment than losing time by multiple authentication trials.
A ROC diagram using GFAR and GFRR will be called Generalized ROC (GROC) diagram for consistency.
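Folding the FER into the curves is a single pass over the measured values. The sketch below uses an assumed function name and made-up curves (they correspond to a quality rejection rate QRR of 0.2) together with an assumed FER of 0.1.

```python
def generalized_rates(far, frr, fer):
    """GFAR(th) and GFRR(th) from FAR(th), FRR(th) and the failure-to-enrol rate."""
    if not 0 <= fer <= 1:
        raise ValueError("FER must lie between 0 and 1")
    gfar = [(1 - fer) * v for v in far]
    gfrr = [fer + (1 - fer) * v for v in frr]
    return gfar, gfrr

# Hypothetical curves for thresholds th = 0..K with K = 4 and QRR = 0.2.
far = [0.8, 0.4, 0.1, 0.0, 0.0]
frr = [0.2, 0.2, 0.3, 0.5, 1.0]
fer, qrr = 0.1, 0.2

gfar, gfrr = generalized_rates(far, frr, fer)
print(gfar)
print(gfrr)
```

The first and last entries match the border values: GFAR(0) = (1 - FER)(1 - QRR), GFRR(0) = FER + (1 - FER) QRR, GFAR(K) = 0 and GFRR(K) = 1.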
Q. What does separability of a biometric system mean?
The Receiver Operating Characteristic (ROC) offers an objective comparison of different biometric systems, in the form of a graph. More practical would be the specification of one single measured value, which forms a kind of average of all the systems settings. Therewith, only a global description of the system would be possible. One must therefore understand that a system can be better overall, despite worse local functioning, for example in an operating point.
Separability is intuitively the ability of a biometric system to differentiate authorized and unauthorized users on the basis of a biometric feature. The higher the separability, the fewer the errors while differentiating authorized and unauthorized users. The measure of the separability, like that of the ROC, cannot be dependent on implementation specific scales. Additionally, a separability measure should be easy to calculate.
A well known measure for the (inverse) separability is the Equal Error Rate (EER). Unfortunately, the EER describes only one single point of the ROC. While the definition is simple, the calculation is not so easy; the EER point does not exist as a measurement, instead it is derived through decision and approximation.
An (inverse) separability measure which also avoids the disadvantages of the EER is the area below the ROC graph. It can be calculated easily by summing over all ROC values. The only difficulty is the fact that the ROC values are not equidistant. Therefore, every y value (FRR) must be weighted by the distance between its corresponding x value (FAR) and the next value. This distance for every ROC point is just the difference (that is, the gradient) of two consecutive values in the FAR graph. As a result, the distance is given by the probability distribution graph of non-authorized users. (For continuous functions, in which the sum can be replaced by an integral, this would be a consequence of the substitution rule for integrals!) The ROC area, here called ROCA, is (K+1 is the number of similarity ratings considered):
$$\mathrm{ROCA} = \sum_{n=1}^{K} \mathrm{FRR}(n)\, p_N(n-1)$$
where pN is the probability distribution function of non-authorized users.
This formula needs only additions and multiplications of existing measured values. Even though implementation-specific similarity ratings n are summed, the ROCA is still independent of their definition. However, one must assume that no threshold-independent rejections occur, i.e., FRR = FNMR and FAR = FMR.
Both EER and ROCA can take on values between 0 and 1. Ideal separability of a biometric system, and therewith of the distributions pB and pN, obviously results in EER and ROCA values of 0. But what value corresponds to ideal non-separability? Intuitively, ideal non-separability can only mean that both distributions pB and pN are exactly the same. In that case:
$$p_N = p_B \;\Rightarrow\; \mathrm{FAR} = 1 - \mathrm{FRR} \;\Rightarrow\; \mathrm{EER} = \tfrac{1}{2}$$
and:
$$p_N = p_B \;\Rightarrow\; \mathrm{ROCA} = \sum_{n} \mathrm{FRR}(n)\, p_B(n-1) \approx \tfrac{1}{2}$$
(Proof for the approximation: one replaces the sum with an integral and considers pB as the derivative of FRR. Now, only the rule for integration by parts is needed.)
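Written out, the argument looks roughly as follows. This is a sketch that assumes a continuous threshold variable t in place of the discrete rating n, and an FRR that rises from 0 to 1 over the integration range.

```latex
\begin{align*}
\mathrm{ROCA} &\approx \int \mathrm{FRR}(t)\, p_B(t)\, dt
   = \int \mathrm{FRR}(t)\, \mathrm{FRR}'(t)\, dt \\
  &= \Big[\mathrm{FRR}(t)^2\Big]_{\mathrm{FRR}=0}^{\mathrm{FRR}=1}
     - \int \mathrm{FRR}'(t)\, \mathrm{FRR}(t)\, dt
   = 1 - \mathrm{ROCA} \\
\Rightarrow\quad \mathrm{ROCA} &\approx \tfrac{1}{2}
\end{align*}
```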
Reasonable values for EER and ROCA lie between the extremes: 0 for perfect separability and ½ for perfect non-separability. What do values between ½ and 1 mean, then? This range is left for cases in which the distributions pB and pN trade roles and change places in the diagram. For separability, this range has practically no meaning in biometrics.
Q. What does one need to be aware of regarding the FAR/FRR? |
The measurement of biometric features, as well as the features themselves, is subject to statistical fluctuations. Therefore, every biometric recognition system has a built-in acceptance threshold; raising it decreases the FAR and increases the FRR. It should be clear that the stated FAR and FRR values must belong to the same threshold value. Stating only the FAR or only the FRR is thus misleading.
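The trade-off can be illustrated with a short Python sketch. It assumes that higher scores mean greater similarity and that a sample is accepted when its score is at or above the threshold. The scores are made-up illustration values, not measurements, and the function name is hypothetical.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) at one acceptance threshold.

    Assumed decision rule: accept when score >= threshold.
    """
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


# Made-up similarity scores for authorized (genuine) and non-authorized users.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.72, 0.88, 0.59]
impostor = [0.12, 0.35, 0.48, 0.61, 0.22, 0.40, 0.70, 0.30]

for t in (0.5, 0.6, 0.7, 0.8):
    far, frr = far_frr(genuine, impostor, t)
    print(f"threshold={t:.1f}  FAR={far:.3f}  FRR={frr:.3f}")
```

Each printed line is one FAR/FRR pair for one threshold, which is why the two values are only meaningful when quoted together.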
Additionally, even the Failure-to-Enrol Rate FER must be considered when comparing the FAR/FRR values of different systems. This is because the enrolment procedure can be parametrized in such a way that only best quality biometric features are approved for biometric templates while lower quality samples are dropped, thus contributing to a higher FER. Normally, the higher the FER forced by the biometric system, the better the FAR and FRR values, and vice versa!
In biometrics FAR/FRR are not theoretically ascertainable, instead they must be determined statistically in costly tests. Determining statistical significance is equally difficult. There were no standardized techniques, therefore results could vary due to differences in test conditions and sample size. Clarity was only provided by disclosure of the test conditions.
Q. Is a biometric system's performance dependent upon the user? |
Generally, yes. This applies for false acceptance rate (FAR) as well as for false rejection rate (FRR). We experience this in our everyday lives -- some faces are easy to recognize and remember, whereas others are difficult. Therefore, the statistical means of FAR and FRR, typical indicators, are not very helpful for individual users. This dependence on the individual user is also responsible for the fact that statistical properties of FAR and FRR measurements are very difficult to quantify.
Q. Is Failure to Enrol a typical problem for biometric systems? |
Every biometric characteristic can fail occasionally or permanently. Temporary failures can be caused, for example, by worn down or sticky fingertips for fingerprints, medication in iris identification (atropine), hoarseness in voice recognition, or a broken arm affecting one's signature. Well known permanent failures are, for example, cataract, which makes retina identification impossible, or rare skin diseases which permanently destroy a fingerprint. Therefore, every biometric system needs a fall-back process. A fall-back is also needed when a key is lost or a PIN is forgotten, so user failure affects all authentication systems, not only biometric ones. In fact one can see that here, too, biometric systems are preferable to conventional methods.
Q. How are the FAR and FRR minimized in a biometric system? |
The false acceptance rate (FAR) can be adjusted in the recognition algorithm via the acceptance threshold: the higher the acceptance threshold, the lower the FAR. Raising the acceptance threshold, however, also raises the FRR. Therefore, the goal must be to have as small a FAR as possible for any given FRR, and vice versa. There are certain factors which primarily influence the FAR, while others mainly affect the FRR. For a fixed FRR, the FAR depends on the following factors:
type of biometric feature
quality of the sensors
user behavior
effectiveness of the recognition algorithm
the number of biometric references in an identification system
This makes the optimization possibilities clear:
determine suitable biometric characteristics: here the uniqueness of the biometric characteristics essentially affects the FAR, whereas permanence and measurability affect the FRR
choose the sensor with the best (picture) quality: this mainly reduces the FRR
eliminate false operations of the user: this also reduces the FRR
optimize the recognition algorithm
limit the number of biometric references in an identification system: this reduces the FAR and increases the FRR
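Why the number of references matters for the FAR can be sketched as follows. The sketch assumes that each comparison against a stored reference is an independent trial with the same single-comparison FAR. This is a common simplification and not something the source states. The function name is hypothetical.

```python
def identification_far(far_single, n_references):
    """Probability of at least one false acceptance among n_references comparisons.

    Assumption: comparisons are independent and share the same single-comparison FAR.
    """
    return 1.0 - (1.0 - far_single) ** n_references


for n in (1, 10, 100, 1000):
    print(n, identification_far(0.001, n))
```

Under this assumption the overall FAR grows with every added reference, so keeping the reference set small keeps the FAR of an identification system low.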
Q. Is the Equal Error Rate a robust measure for system performance? |
No. Most practical biometric systems are not adjusted via the threshold parameter to FAR = FRR, which defines the EER, but to FAR << FRR. Since the ROCs of different systems may behave completely differently, two systems with the same EER may even differ by orders of magnitude at other ROC points. To avoid such large errors, only the FAR/FRR pairs at the operating point should be considered, e.g., by comparing the FARs at a common FRR. Considering the EER is only reasonable in those rare cases where the system uses the EER as its operating point.
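A short sketch of such a comparison at a common FRR. Both ROC point lists are hypothetical and were constructed to share an EER of 0.05. The helper name and the use of linear interpolation between measured points are assumptions made for illustration.

```python
def far_at_frr(roc, target_frr):
    """Linearly interpolate the FAR at a given FRR.

    roc: list of (frr, far) pairs sorted by increasing FRR.
    """
    for (x0, y0), (x1, y1) in zip(roc, roc[1:]):
        if x0 <= target_frr <= x1:
            if x1 == x0:
                return min(y0, y1)
            t = (target_frr - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("target FRR outside the measured range")


# Hypothetical ROC points (FRR, FAR); both systems pass through FAR = FRR = 0.05.
system_a = [(0.0, 1.0), (0.05, 0.05), (0.10, 0.001), (1.0, 0.0)]
system_b = [(0.0, 1.0), (0.05, 0.05), (0.10, 0.03), (1.0, 0.0)]

for name, roc in (("A", system_a), ("B", system_b)):
    print(name, far_at_frr(roc, 0.10))
```

Although the two example systems have the same EER, their FARs at an FRR of 0.10 differ by a factor of 30, which is the kind of difference the EER alone hides.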
Source: http://www.bromba.com/faq/biofaqe.htm
