Several kinds of chemical formulae may be attached to a sample record:
· Empirical formula: the observed empirical formula of the sample. Reciprocal Net suggests that every sample record contain one. Coefficients and subscripts may be expressed as whole numbers or as decimals.
· Empirical formula – less solvent: the same as the standard empirical formula above, except that any solvent of crystallization has been removed, leaving only the “pure” formula of the substance. Such a formula is useful mainly for samples that are intended for release to a “general science” audience, such as those in Reciprocalnet.org’s Common Molecules collection.
· Empirical formula – single ion: similar to the standard empirical formula above, but representing only one (presumably complex) ion in a salt. Like the Empirical formula – less solvent, such a formula is useful mainly for samples that are intended for release to a “general science” audience.
· Empirical formula – derived: like the Empirical formula – less solvent above, this kind of formula is not a true empirical formula and is useful mainly for samples that are intended for release to a “general science” audience.
· Structural formula: a chemical formula written in a way that reflects some of the molecular structure details. Structural formulae should represent exactly the same cumulative element counts as the empirical formula for the sample (whether or not an empirical formula item is included among the sample data).
· Moiety formula: a chemical formula that describes how the empirical formula is divided among discrete, covalently bonded chemical moieties, but without specific structural detail. Moiety formulae should represent exactly the same cumulative element counts as the empirical formula for the sample (whether or not an empirical formula item is included among the sample data).
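The requirement that a structural formula carry the same cumulative element counts as the empirical formula can be checked mechanically. The following is a minimal Python sketch and not part of Reciprocal Net itself: the names `element_counts` and `same_composition` are hypothetical, and the parser assumes the grouping syntax recommended later in this section (parentheses followed by an optional multiplier). It does not handle the commas, charges, or pre-multipliers of moiety formulae, and it only totals elements; it does not enforce separators or element ordering.

```python
import math
import re

# Hypothetical helper names; the token syntax is an assumption based on the
# recommended format: element symbols, numeric counts, and parentheses.
TOKEN = r"[A-Z][a-z]?|\d+(?:\.\d+)?|[()]"

def element_counts(formula):
    """Total the count of each element, expanding (...)n groups."""
    if re.sub(TOKEN + r"|\s+", "", formula):
        raise ValueError(f"unexpected characters in {formula!r}")
    tokens = re.findall(TOKEN, formula)
    stack = [{}]  # one dict of running totals per open parenthesis
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if tok == "(":
            stack.append({})
            continue
        if tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            item = stack.pop()  # totals of the group just closed
        elif tok[0].isalpha():
            item = {tok: 1.0}
        else:
            raise ValueError(f"count {tok!r} does not follow an element or group")
        # An omitted count means 1; otherwise the next token is the count.
        multiplier = 1.0
        if i < len(tokens) and tokens[i][0].isdigit():
            multiplier = float(tokens[i])
            i += 1
        for element, n in item.items():
            stack[-1][element] = stack[-1].get(element, 0.0) + n * multiplier
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]

def same_composition(formula_a, formula_b, tol=1e-6):
    """True if both formulae have the same elements and counts (within tol)."""
    a, b = element_counts(formula_a), element_counts(formula_b)
    return a.keys() == b.keys() and all(
        math.isclose(a[el], b[el], abs_tol=tol) for el in a
    )

# Acetone: structural formula versus empirical formula.
print(same_composition("(C H3)2 C O", "C3 H6 O"))  # prints: True
```

Counts are compared with a small tolerance because decimal counts are permitted and are held here as floating-point numbers.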
No limit is enforced on the number of chemical formulae of each type that may be attached to a sample record, but in most contexts it makes sense to have only a single formula of each type. If a record holds multiple formulae of a given type, Reciprocal Net reserves the right to discard all but one during certain operations that are not yet defined.
Reciprocal Net recommends that all partner labs observe a specific chemical formula syntax. A shared syntax makes searching across the various labs’ collections more reliable and improves the display of chemical formulae: formulae stored in the standard format are displayed on web pages with appropriate subscript and superscript tags. The Reciprocal Net recommendations are based on CIF conventions (defined by the Crystallographic Information File format specification), including these rules:
· No chemical symbols other than recognized element symbols may be used.
· Each element symbol should be followed by a positive, numeric (not necessarily integer) count. CIF conventions allow a count of 1 to be omitted; under the additional Reciprocal Net rules below, a count of 1 should always be omitted.
· A space or parenthesis must separate each cluster of element and count from any next and previous ones (but parentheses are valid only where specified below).
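The three CIF-based rules above lend themselves to a simple automated check for formulae without grouping, such as empirical formulae. The sketch below is illustrative only: `check_flat_formula` is a hypothetical name, and the element table assumes that "recognized element symbols" means the 118 standard symbols (it does not include D for deuterium, for example).

```python
import re

# Assumption: the recognized symbols are the 118 standard element symbols.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni
Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I
Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt
Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split())

# One cluster: letters (the candidate symbol) plus an optional numeric count.
CLUSTER = re.compile(r"([A-Za-z]+)(\d+(?:\.\d+)?)?")

def check_flat_formula(formula):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    for cluster in formula.split():  # clusters must be separated by spaces
        match = CLUSTER.fullmatch(cluster)
        if match is None:
            problems.append(f"{cluster!r}: not one element symbol plus an optional count")
            continue
        symbol, count = match.groups()
        if symbol not in ELEMENTS:
            problems.append(f"{cluster!r}: {symbol!r} is not a recognized element symbol")
        if count is not None and float(count) <= 0:
            problems.append(f"{cluster!r}: count must be positive")
    return problems

print(check_flat_formula("C6 H12 O6"))       # prints: []
print(len(check_flat_formula("C2H6O")))      # prints: 1  (clusters not separated)
print(len(check_flat_formula("Me3 Si Cl")))  # prints: 1  (Me is not an element)
```

Returning a list of problems rather than raising on the first one lets a lab report every issue in a formula at once.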
Going beyond the CIF conventions, Reciprocal Net recommends that these additional rules be observed:
· The numeric count for any element symbol or group is assumed to be 1 unless a count is specified. A numeric count of 1 should always be omitted.
· Except in moiety formulae, any multiplier for a parenthesized group must follow the closing parenthesis (that is, pre-multipliers are permitted only in moiety formulae).
· Except in structural formulae, elements are listed according to the Hill system of Chemical Abstracts: if carbon is present then C should be first, followed by H, followed by all remaining elements in alphabetical order by chemical symbol; if carbon is not present then all elements should be in alphabetical order by chemical symbol. In moiety formulae this rule applies to each moiety individually.
· For those formulae that allow grouping (currently moiety formulae and structural formulae), the only characters that may be used for grouping purposes are parentheses.
· In addition, for moiety formulae:
o Moieties are to be separated from each other by commas.
o Parentheses may not appear inside moieties, but may surround them.
o Any non-zero charge should be specified at the end of its moiety as a + or – sign, separated from the preceding element symbol and count by a space. If the magnitude of the charge is greater than 1, that number immediately precedes the sign (for example, 2+).
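The ordering and moiety rules above can be illustrated by generating formula text from element counts. This is a hedged sketch, not Reciprocal Net code: `format_count`, `hill_formula`, and `moiety` are hypothetical names. It also makes three assumptions: it writes multipliers before the parenthesis (the rules permit this in moiety formulae, and a trailing multiplier would serve equally well), it uses an ASCII hyphen for a negative charge, and it follows each comma with a space.

```python
def format_count(n):
    """Render a count: omit 1, and write whole numbers without a decimal point."""
    if n == 1:
        return ""
    return str(int(n)) if float(n).is_integer() else str(n)

def hill_formula(counts):
    """Build a formula in Hill order from a {symbol: count} mapping."""
    symbols = sorted(counts)  # alphabetical by element symbol
    if "C" in counts:
        # With carbon present: C first, then H, then the rest alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    return " ".join(s + format_count(counts[s]) for s in symbols)

def moiety(counts, charge=0, multiplier=1):
    """Render one moiety with an optional charge and an optional multiplier."""
    text = hill_formula(counts)  # Hill order applies to each moiety individually
    if charge:
        size = abs(charge)
        sign = "+" if charge > 0 else "-"
        text += " " + ("" if size == 1 else str(size)) + sign
    if multiplier != 1:
        text = f"{format_count(multiplier)}({text})"
    return text

print(hill_formula({"O": 1, "H": 6, "C": 2}))  # prints: C2 H6 O
print(hill_formula({"S": 1, "O": 4, "H": 2}))  # prints: H2 O4 S

# Ammonium sulfate as a moiety formula: two ammonium cations, one sulfate anion.
parts = [
    moiety({"N": 1, "H": 4}, charge=1, multiplier=2),
    moiety({"S": 1, "O": 4}, charge=-2),
]
print(", ".join(parts))  # prints: 2(H4 N +), O4 S 2-
```

Plain string sorting yields alphabetical order here because every element symbol is one capital letter optionally followed by one lowercase letter.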