Kelvin is written in ANSI C, with some supporting scripts in Perl. Kelvin can generally be expected to run on any relatively recent Linux or Unix distribution. It has been successfully built with various versions of the GNU C Compiler (GCC) and the Intel C Compiler (ICC).
Kelvin releases have historically been tested on the following platforms:
Kelvin makes extensive use of memory management, and in most circumstances benefits from a drop-in allocator such as Hoard or ptmalloc3. Either of these can halve execution time, and will keep memory fragmentation down when running in multi-threaded mode, but neither is required. Their use is controlled by the compilation conditionals USE_PTMALLOC and USE_HOARD. Hoard usage is disabled by default; ptmalloc3 usage is enabled if it is available on your platform, and disabled otherwise.
Some unsupported features of Kelvin are normally disabled for distribution purposes. They can be restored by removing -DDISTRIBUTION from FILE_CFLAGS in Makefile.main. Note that this is UNSUPPORTED; you do this at your own risk!
Kelvin typically takes all of its configuration information from a single file specified on the command line. This file is composed of directives that describe the analysis to be performed, and the locations of supporting data files. We provide a complete reference to Kelvin directives and several examples at the bottom of this document.
Kelvin supports a wide variety of analyses and options. These can be broken into general categories, with a small number of possibilities for each category. Some analyses/options are compatible with other analyses/options, some are not.
Two-point analysis is the default. Multipoint analysis is enabled with the Multipoint directive. Multipoint analysis is incompatible with linkage disequilibrium and marker-to-marker analyses.
Linkage equilibrium is the default for two-point analyses, and is the only option for multipoint analyses. Linkage Disequilibrium can be enabled for two-point analyses with the LD directive. LD analyses with microsatellites are normally disabled for distribution versions of Kelvin.
A dichotomous trait model is the default. A quantitative trait model can be specified with the QT directive. A quantitative trait model with a threshold can be specified with the QTT and Threshold directives.
By default, Kelvin will perform its calculations using the sex-averaged centiMorgan marker positions in the map file. If sex-specific marker positions are available, Kelvin can be made to use those with the SexSpecific directive. Sex-specific maps are not supported for LD analyses.
Kelvin will ignore the possibility of imprinting (parent-of-origin) effects by default. Imprinting effects can be allowed for by specifying the Imprinting directive.
Trait-to-marker analysis is the default, and considers the relationship between a hypothetical trait locus and a marker or group of markers. Marker-to-marker analysis is enabled with the MarkerToMarker directive, and considers the relationship between pairs of markers only.
By default, Kelvin does not condition penetrances (or QT parameters) on covariates. Kelvin can be made to allow covariate-dependent penetrances (or QT parameters) with the LiabilityClasses directive, and appropriate entries in the pedigree and locus files, which assign individuals to different liability classes. Covariate dependence does not currently work with epistasis analyses.
By default, Kelvin conducts single locus analysis. The Epistasis directive will provide a two-locus model in which penetrances at one locus are allowed to depend upon genotypes at a specified marker. Currently, this marker must be a SNP (two alleles only), and epistasis analysis does not work with additional covariate dependence.
By default, Kelvin will attempt to count and bin pedigrees to reduce computation. It does this by examining the pedigree file prior to analysis, and identifying combinations of pedigree structure, phenotype and genotype that appear more than once. Each unique combination need only be analyzed once, thus reducing the total number of computations. This mechanism will realize the greatest gains for datasets containing mostly small pedigrees: cases-and-controls, trios and affected sib-pairs.
Automatic pedigree counting will be implicitly disabled if any of the following are true:
Automatic pedigree counting can be explicitly disabled with the SkipPedCount directive.
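The counting step can be pictured as grouping identical records. The following Python sketch is only an illustration, not Kelvin's implementation: it assumes each pedigree has already been reduced to a hashable summary of structure, phenotype and genotype (the tuples and the name bin_pedigrees are hypothetical), and counts how often each summary occurs.

```python
from collections import Counter


def bin_pedigrees(summaries):
    """Count identical pedigree summaries so each unique one is analyzed once.

    `summaries` is an iterable of hashable values. How a real pedigree would
    be reduced to such a value is an assumption of this sketch.
    """
    return Counter(summaries)


# Hypothetical case/control records: (sex, phenotype, genotype at one SNP)
records = [
    (1, 2, ("1", "2")),
    (2, 1, ("1", "1")),
    (1, 2, ("1", "2")),
    (1, 2, ("1", "2")),
]
bins = bin_pedigrees(records)
for summary, count in bins.items():
    print(summary, "x", count)
```

Here four records collapse into two unique combinations, so only two likelihood evaluations would be needed, each weighted by its count.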
Kelvin typically requires that allele frequencies be provided using the FrequencyFile directive. However, Kelvin will estimate allele frequencies internally in specific circumstances:
If Kelvin can estimate allele frequencies, it will use those allele frequencies in preference to any provided using the FrequencyFile directive. Automatic allele frequency estimation can be explicitly disabled with the SkipEstimation directive.
Kelvin will exit with error if allele frequency estimation is not possible and the FrequencyFile directive is not present.
Kelvin typically requires four input data files: a pedigree file, a locus file, a map file, and a frequency file. The frequency file may be omitted if Kelvin is to estimate allele frequencies internally.
Kelvin will accept either a pre- or post-MAKEPED format pedigree file. MAKEPED is part of the LINKAGE package. The only exception is that a post-MAKEPED pedigree file is required if a pedigree contains loops, or if the proband must be explicitly specified. While it is possible to create a post-MAKEPED pedigree file by hand, it's much easier to create a pre-MAKEPED pedigree file and run it through MAKEPED. Each line of a pre-MAKEPED pedigree file must start with five columns:
Pedigree ID
Individual ID
Father ID - 0 (zero) if the current individual is a founder.
Mother ID - 0 (zero) if the current individual is a founder.
Sex - 1 (one) if the current individual is male, or a 2 if female.
The remaining columns in the pedigree file are governed by the locus file. The pedigree file may contain any combination of cases-and-controls, nuclear families (defined as two founder parents and one or more children) or general pedigrees.
This file may be referred to as a 'data file' in documentation for older versions of Kelvin. Each line in a locus file consists of an identifier character, and a label. Each line corresponds to one (or two) columns in the pedigree file. The locus file defines the number and order of columns in the pedigree file. The identifier characters, and their meanings, are as follows:
A - Phenotype status/trait value. Corresponds to a single column in the pedigree file, which indicates if individuals are affected, unaffected, unknown, or (for quantitative trait analyses) the trait value. Trait-to-marker analyses must have a phenotype status/trait line in the locus file, and the trait label will appear in all output files. A phenotype status/trait line need not appear for marker-to-marker analyses. The specific values that indicate affected, unaffected, etc., are controlled by the PhenoCodes configuration directive.
T - Used interchangeably with A in Kelvin.
C - Covariate (Liability Class). Corresponds to a single column in the pedigree file. The values in this column are used to group individuals into liability classes to allow for covariate dependence of the penetrances (or QT parameters).
M - Marker. Corresponds to two adjacent columns in the pedigree file. The values in these columns indicate the alleles of the named marker for each individual. Every marker that is listed in the locus file must also appear in the map file and the frequency file.
Kelvin currently requires that, if a phenotype status column is present, it must appear first, before any markers. If a covariate column is present, it must appear immediately after the phenotype status column, before any markers.
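To make the relationship between locus file lines and pedigree file columns concrete, here is a small Python sketch. It is an illustration under stated assumptions rather than Kelvin's own code: the function name pedigree_columns and the trait label are hypothetical, and the starting column of 6 assumes a pre-MAKEPED pedigree file with five leading columns.

```python
def pedigree_columns(locus_lines, first_column=6):
    """Map each locus-file entry to its 1-based pedigree-file column(s)."""
    # A, T and C each take one column; M takes two adjacent allele columns.
    widths = {"A": 1, "T": 1, "C": 1, "M": 2}
    layout = []
    column = first_column  # columns 1-5 hold the IDs and the sex indicator
    for line in locus_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ident, label = fields[0].upper(), fields[1]
        width = widths[ident]
        layout.append((label, list(range(column, column + width))))
        column += width
    return layout


# Hypothetical locus file contents: one trait followed by two markers
example = ["T Trait", "M rs2112", "M rs1984"]
for label, cols in pedigree_columns(example):
    print(label, cols)
```

With this input the trait lands in column 6 and the two markers in columns 7-8 and 9-10.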
This file lists the positions of markers on the chromosome, in Haldane centiMorgans (Kosambi is implemented but unsupported). The first line of the file may optionally indicate the map function in use with a mapFunction line.
The rest of the file must consist of three (or more) columns, each identified by a column header on the first (after the optional mapFunction) line. The column headers are as follows:
CHROMOSOME or CHR - The chromosome number. Kelvin currently does not allow map files to contain markers from multiple chromosomes.
MARKER or NAME - The marker name. Marker names may contain numbers, letters, underscores and hyphens only.
POSITION or POS or HALDANE or KOSAMBI - The sex-averaged centiMorgan position of the marker. If the column header is HALDANE or KOSAMBI, the map function will be set accordingly.
MALE or MALEPOSITION - The male sex-specific centiMorgan position of the marker. This column is only required for sex-specific analyses, and will be ignored otherwise.
FEMALE or FEMALEPOSITION - Same as MALE, but for females.
PHYSICAL or BASEPAIR - The basepair position of the marker. This column is optional. If present, this value will appear in some output files.
If the map function is specified by both a mapFunction line and by a column header, the same map function must be specified. If the map function is specified by neither, the default is Haldane. Kosambi is only permitted if the AllowKosambiMap directive is present in the configuration file; otherwise, if Kosambi centiMorgans are specified, Kelvin will exit with an error.
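The column header rules above can be sketched in Python as a lookup from each accepted header to a canonical column name. This is an illustration only: the canonical names, the function name parse_map_header and the case-insensitive matching are assumptions of the sketch, and the optional mapFunction line is not handled.

```python
HEADER_ALIASES = {
    "CHROMOSOME": "chromosome", "CHR": "chromosome",
    "MARKER": "marker", "NAME": "marker",
    "POSITION": "position", "POS": "position",
    "HALDANE": "position", "KOSAMBI": "position",
    "MALE": "male", "MALEPOSITION": "male",
    "FEMALE": "female", "FEMALEPOSITION": "female",
    "PHYSICAL": "basepair", "BASEPAIR": "basepair",
}


def parse_map_header(header_line):
    """Return (canonical column names, map function implied by the header).

    The map function is None when no HALDANE or KOSAMBI header is present,
    in which case the documented default (Haldane) would apply.
    """
    columns, map_function = [], None
    for name in header_line.split():
        key = name.upper()  # case-insensitivity is an assumption here
        if key not in HEADER_ALIASES:
            raise ValueError(f"unknown map file column header: {name}")
        columns.append(HEADER_ALIASES[key])
        if key in ("HALDANE", "KOSAMBI"):
            map_function = key.lower()
    return columns, map_function


print(parse_map_header("CHR MARKER KOSAMBI"))
```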
This file identifies the alleles for each marker, and specifies the frequencies for those alleles. This file is only required if Kelvin cannot estimate allele frequencies internally. The file is composed of a series of line groups. Each group begins with a marker line, followed by one or more allele lines. Similar to the locus file, each line starts with an identifier character:
M - Marker line. Should be followed by the marker name only. Starts a new line group, and implicitly finishes any previous line group, if any.
F - Unlabeled alleles. Must be followed by only space-separated frequencies. This implicitly creates alleles labeled with consecutive numbers, starting with '1'. Multiple F lines may appear in a line group; the implied labels will be consecutive with implied labels from previous F lines.
A - Labeled alleles. Must be followed by an allele label and the frequency for that allele. Each allele must be specified on a separate line.
For any given marker, alleles must all be specified as labeled or unlabeled, although different markers in the same frequency file may use either method.
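The line-group rules above lend themselves to a short parser. The Python sketch below is illustrative only and is not Kelvin's reader; the function name parse_frequency_file is hypothetical, and the way the sample frequencies are split across F lines is assumed.

```python
def parse_frequency_file(lines):
    """Return {marker: {allele_label: frequency}} from frequency-file lines."""
    markers, current, style = {}, None, None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        ident, rest = fields[0], fields[1:]
        if ident == "M":
            # A marker line starts a new group and closes the previous one.
            current, style = markers.setdefault(rest[0], {}), None
        elif ident in ("F", "A"):
            if current is None:
                raise ValueError("allele line before any marker line")
            if style not in (None, ident):
                raise ValueError("labeled and unlabeled alleles mixed in one marker")
            style = ident
            if ident == "F":
                for freq in rest:
                    # Implied labels continue from earlier F lines.
                    current[str(len(current) + 1)] = float(freq)
            else:
                label, freq = rest
                current[label] = float(freq)
        else:
            raise ValueError(f"unknown identifier: {ident}")
    return markers


sample = [
    "M rs2112",
    "F 0.233 0.377",
    "F 0.300 0.090",
    "M SNP_GO-7188",
    "A 2 0.866",
    "A 1 0.134",
]
print(parse_frequency_file(sample))
```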
Pedigree File (pre-MAKEPED):
Locus File:
In this example, we have a small three-generation family. The pedigree ID (212) is in column 1, and individual IDs (101 through 105) are in column 2. Note that individual IDs do not need to be consecutive. The father and mother IDs are in columns 3 and 4, respectively. Individuals 101, 102 and 104 are founders (their parents are not present in the pedigree), so their father and mother IDs are coded with 0 (zeros). Individual 103 is descended from individuals 101 and 102; individual 105 is descended from individuals 103 and 104. Sex indicators are in column 5; individuals 101, 103 and 105 are males, and individuals 102 and 104 are females. The remaining columns are defined by the locus file.
Line 1 of the locus file specifies a phenotype status column, which corresponds to column 6 in the pedigree file. This column follows the typical convention of dichotomous traits, coding unaffected individuals with a 1 (one), affected individuals with a 2 and unphenotyped individuals with a 0 (zero).
Each of the remaining lines in the locus file specifies a marker, and corresponds to two columns in the pedigree file: locus file line 2 corresponds to pedigree file columns 7 and 8, line 3 to columns 9 and 10, etc. For a fun exercise, spot the impossible inheritance in the sample pedigree.
Installing MAKEPED is beyond the scope of this documentation, but the general command to process a pedigree file with MAKEPED is:
makeped ped.pre ped.post n
where ped.pre is the pre-MAKEPED file you've created, ped.post is the post-MAKEPED file you want to create, and the last argument is either y or n, depending on whether any of the families in the pedigree file contain consanguinity loops.
Map File:
The map file contains the minimum three columns: chromosome (column 1), marker (column 2) and sex-averaged centiMorgan position (column 3). The map function is not explicitly specified with a mapFunction line or by the header of column 3, and is therefore assumed to be Haldane.
Frequency File:
The frequency file contains five line groups, one for each marker. Lines 1 through 3 specify marker rs2112 with four unlabeled alleles, which will be implicitly labeled 1, 2, 3 and 4, with corresponding frequencies of 0.233, 0.377, 0.300 and 0.090. Lines 4 and 5 specify marker SNP_A-90125 with three unlabeled alleles. Lines 6 through 8 specify marker SNP_GO-7188 with two labeled alleles: 2, with a frequency of 0.866, and 1, with a frequency of 0.134. Lines 9 through 11 specify marker rs8675309 with two labeled alleles. Lines 12 and 13 specify a marker rs1984 with a single unlabeled allele with a frequency of 1.000. Note that in no case are labeled and unlabeled alleles mixed within a single marker.
Once you have installed Kelvin, you can run it from your data directory, where you keep your configuration and data files. Kelvin takes only one parameter, which is the name of the configuration file, e.g.:
Kelvin kelvin.conf
Remember that if you did not specify absolute paths for output files in the configuration file, they will be written relative to your current directory.
It is often important to capture all output from a run into a file so that you may review it after the run completes, or send it to us for diagnosis. The following command (using sh/ksh/bash syntax) runs Kelvin with all output redirected to a file called kelvin.out:
Kelvin kelvin.conf > kelvin.out 2>&1
Or, using csh/tcsh syntax:
Kelvin kelvin.conf >& kelvin.out
If you do need to send us information for diagnosis, please include the configuration and data files along with the output from the run.
When Kelvin is run, it first displays version, build and configuration information. All messages are prefaced with the current date and time. Messages fall into three categories:
For example:
Lines 1 and 15-19 are progress messages. Line 2 is the Kelvin major version and build information. Next is the compiler version number (line 3), followed by important compilation conditionals as specified in the Makefile and run characteristics as influenced by environment variables. In the example given, OpenMP support status and the number of threads used (line 5) are displayed.
Following this is a line (7) describing the action to take in order to force an early display of progress messages. You can use this as a sort of "pulse check" to make sure Kelvin is still alive and well and making progress should there be a lull between automatic progress updates. If you are running Kelvin interactively, you can perform this "pulse check" by typing CTRL-\ (that is, hold down the CTRL or CONTROL key while pressing the backslash). If you are running under cygwin, you will first need to type stty quit ^C to make this work. Note that the aforementioned ^C is actually the two-character sequence of 'caret' (shifted 6) and 'C'. Pressing CTRL-\ sends a SIGQUIT signal to Kelvin, which it interprets as a request for status. If you are running Kelvin as a detached process or in a batch queue, you can send a SIGQUIT to the process by logging into the node it is running on and using the kill command as described in the diagnostic output. Note that the signal number (-3 in the example) can differ from platform to platform, and the process ID (26577 in the example) will differ from run to run. The "at some risk" wording is there because some status information is displayed asynchronously, i.e. regardless of the current context of the evaluation, and has been known, albeit extremely infrequently, to crash the program.
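For a detached run, the same status request can be sent from a script. The following is a minimal Python sketch, assuming a Unix-like platform and that you already know the process ID; the helper name request_status is hypothetical. It uses the symbolic signal name, so the platform-specific signal number does not need to be hard-coded.

```python
import os
import signal


def request_status(pid):
    """Send SIGQUIT to the process `pid` to request a progress update.

    The same "at some risk" caveat applies as for CTRL-\\ or the kill command.
    """
    os.kill(pid, signal.SIGQUIT)
```

For the process shown in the sample output this would be called as request_status(26577), with the process ID replaced by that of your own run.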
Next, the location of the configuration file (line 8) and the analysis and run characteristics as determined from that file (lines 9-12) are displayed. The last of these (lines 13 and 14) are a terse description of the scope and nature of the analysis. For example, a multipoint run might display:
This indicates that for each of 8 trait loci, PPLs will be computed using the closest 4 markers, integrating over a dynamically-sampled trait space, with all individuals in a single liability class. A two-point analysis will provide slightly different information, e.g.:
This indicates that for each of 67 initial D' values, two-point PPLs and LD statistics will be computed, integrating over a dynamically-sampled trait space, with all individuals in a single liability class.
Finally, progress indicators are displayed up through the end of the run.
Kelvin produces several different files of results depending upon the type of analysis performed.
Integrated likelihood ratio (BR) information is produced for all runs. It is written to the file br.out by default, although that filename is configurable. The first line written to the file is a comment that contains the Kelvin version number, e.g.:
This is included so that results can be associated with the version of Kelvin that produced them, and will allow for variations in algorithms, file formats and data precision. Subsequent lines contain different information for each type of analysis being performed:
For multipoint runs, a single table with a header line is output. The table consists of one row for each trait position. Columns are whitespace-delimited:
Two-point analyses output separate tables for each locus. Each table is prefaced with a comment line that details the chromosome, name and position of the current marker, e.g.:
and consists of one row of whitespace-delimited values for each value of D' (in the case of an LD analysis) and theta. Columns are whitespace-delimited:
Separate PPL information is produced for two-point and marker-to-marker analyses. It is written to the file ppl.out by default, although that filename is configurable. The first line is a comment that contains the Kelvin version number, as with the Bayes Ratio file. The next line is a header, followed by one line of whitespace-delimited values for each marker in the run.
All analyses can optionally write maximizing models to a separate file, named mod.out by default, although that name is configurable. In all cases, the first line is a comment line that identifies the version of Kelvin that created the file. Note that due to Kelvin's use of an efficient numerical integration algorithm, there is no guarantee that the maximum MOD will occur at the true maximum values of all parameters. The default integration routine can be overridden in order to perform maximization over a fixed grid of user-specified parameter values for precise maximization.
For multipoint runs, a single table with a header line is output. The table consists of one row for each trait position, containing the values of all trait parameters that maximized the likelihood at that position. Columns are whitespace-delimited:
n
PV(...) - the maximizing penetrance vector, one for each liability (covariate) class in the analysis. For dichotomous trait runs, it is three (or four, if imprinting effects are being considered) columns of the maximizing penetrance for DD, Dd, (dD,) and dd. For quantitative trait (QT) runs with the normal distribution, it is three (or four) columns of means followed by three (or four) of standard deviations for the maximizing distributions for DD, Dd, (dD,) and dd, followed by the threshold in the case of the QTT model. Quantitative trait runs with the Chi-Square distribution have only three (or four) columns of degrees of freedom followed by the threshold. Values are comma-separated and enclosed in parentheses. Header text reflects actual count and nature of columns. Again, the models reported here may not represent true maximizing models, if the integration algorithm has bypassed the true overall maximum of the parameter space.
For two-point analyses, a separate table is output for each marker, or pair of markers, in the case of marker-to-marker analyses. Each table is prefaced with a comment line that details the chromosome, name and position of the marker (or markers). For marker-to-trait analyses, this line is identical to that in the Bayes Ratio file. For marker-to-marker, the line looks like:
Each table is prefaced with a header line. Columns are whitespace-delimited:
n
PV(...) - as described for multipoint maximizing model files. Not present in marker-to-marker analyses.
Sequential updating is a method for combining the results of multiple analyses in post-processing. Sequential updating tools are included in the Kelvin distribution.
This program takes one or more Kelvin-format Bayes ratio files, and produces PPL (and, if appropriate, PPLD) statistics. If a single Bayes ratio file is provided, calc_updated_ppl simply reproduces the PPL statistics generated by the original analysis. If multiple Bayes ratio files are provided, the Bayes ratios are sequentially updated and new PPL statistics are generated.
calc_updated_ppl can also use the PPL statistics generated by a multipoint analysis as prior probabilities when calculating the PPLD (posterior probability of linkage and LD) statistic. In this way, the results of a linkage analysis can be combined with an LD analysis. Both the linkage and LD analyses must be aligned against the same map.
By default, calc_updated_ppl expects Bayes ratio files from two-point, sex-averaged analyses, in the most recent format, and displays PPL statistics. Bayes ratio files created using older (pre-2.1) versions of Kelvin must be converted to the current format using the provided conversion script. All input files must contain Bayes ratios for the same markers, in the same order. The default behavior can be modified with a number of command line flags:
-m, --multipoint
-s, --sexspecific
-M [mapfile], --mapin [mapfile]
-f, --forcemap
-R [pplfile], --pplout [pplfile]
-O [brfile], --partout [brfile]
-P [pplfile], --pplin [pplfile]
-r, --relax
-?, --help
Here's a simple example of updating across two two-point Bayes ratio files, and saving the updated statistics to a file:
calc_updated_ppl br1.out br2.out > updated-ppl.out
Here's an example of updating across two multipoint Bayes ratio files, and requesting that the updated Bayes ratios be written to the file br-updated.out:
calc_updated_ppl -m --partout br-updated.out br1.out br2.out > updated-ppl.out
Here's an example of updating across three two-point Bayes ratio files, each of which contains a slightly different subset of the total set of markers. A map file is required in this case:
calc_updated_ppl --mapin complete.map br1.out br2.out br3.out > updated-ppl.out
Here's an example of using multipoint PPLs with a single two-point, LD Bayes ratio file. This just recalculates the PPLD statistics for the given Bayes ratios, but uses the multipoint PPLs as priors for the PPLD statistic (producing the cPPLD statistic):
calc_updated_ppl --pplin mp-ppl.out br.out > cppld.out
This Perl script will convert avghet.out or br.out files generated by older versions of Kelvin into the most recent format. Output is always to the terminal.
-m [mapfile]
If nomap is passed instead of a mapfile, then the marker positions will be filled with sequential integers.
-c [chrnum]
Here's an example that captures the output to a file:
convert_br.pl -c 12 -m mapfile.dat avghet-old.out > br-new.out
Note that in this example, -c 12 specifies that avghet-old.out contains data from an analysis of chromosome 12, and that marker position information is in the file mapfile.dat. A full description of command line syntax is available by typing:
convert_br.pl --help
Kelvin's configuration file format is intended to be easy to read and edit. Kelvin, in general, requires the user to explicitly specify what Kelvin is to do, and doesn't make any assumptions about what the user may or may not want. Kelvin will display helpful error messages and terminate if the configuration file contains incompatible directives.
In addition to the configuration file, any valid directive may be specified on the command line by prepending the directive with '--' (two hyphens). Any additional arguments on the command line will be treated as arguments to the directive, up to the end of the line, or the next directive, which again must be prepended with '--'. Directives on the command line that specify input or output files will override the values set in the configuration file. Directives that take a series or range of values (like TraitPositions), will add to the values, if any, specified in the configuration file. There is currently no way from the command line to remove values set in the configuration file.
The Kelvin configuration file is a text file containing directives that describe the nature of the analysis to be performed, and the names of input and output files. Directives typically appear on separate lines, but multiple directives, separated by a semicolon (';'), may appear on the same line. Some directives require arguments, which must appear on the same line as the directive they modify. Blank lines in the configuration file are ignored. Extra white space (spaces and tabs) around directives and arguments is ignored. Comments may be embedded in a configuration file with a pound sign ('#'). The pound sign and any text, up to the end of the line, will be ignored. Comments may appear on a line by themselves, or at the end of a line that also contains one or more directives. In the latter case, a semicolon is not required to separate the directive and the comment.
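The line-level rules above (comments, semicolons, blank lines and extra whitespace) can be sketched in a few lines of Python. This is an illustration of the described syntax, not Kelvin's parser; the function name split_directives is hypothetical, and the sketch assumes a pound sign never appears inside an argument such as a filename.

```python
def split_directives(text):
    """Split configuration text into individual directive strings."""
    directives = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]       # drop a comment up to end of line
        for part in line.split(";"):       # several directives may share a line
            part = " ".join(part.split())  # collapse extra spaces and tabs
            if part:                       # ignore blank lines and empty parts
                directives.append(part)
    return directives


config = """# sample configuration
MultiPoint 4 ; PhenoCodes 0, 1, 2   # affection status codes

MapFile    mapfile.dat
"""
print(split_directives(config))
```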
Directives are words or combinations of words with no intervening spaces. This document will capitalize the individual words that make up a directive (for example, MapFile), but they may appear in the configuration file with any capitalization (Mapfile, mapfile, MAPFILE, etc.). Also, directives may be abbreviated, so long as the abbreviation is unique. For example, MapFile could be abbreviated to Map. SexLinked could be abbreviated to SexL, but not to Sex, since that could also be an abbreviation for SexSpecific.
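The matching rule for capitalization and abbreviations can be sketched as a case-insensitive unique-prefix lookup. The Python below is an illustration, not Kelvin's implementation: the function name resolve_directive is hypothetical, the list of known directives is deliberately partial, and letting an exact match win over longer names that share the same prefix is an assumption.

```python
def resolve_directive(word, known):
    """Resolve a possibly abbreviated, case-insensitive directive name."""
    lowered = word.lower()
    for name in known:
        if name.lower() == lowered:  # assumption: an exact match always wins
            return name
    matches = [name for name in known if name.lower().startswith(lowered)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown directive: {word}")
    raise ValueError(f"ambiguous directive {word}: {', '.join(sorted(matches))}")


# A partial list of directives, for illustration only
KNOWN = ["MapFile", "SexLinked", "SexSpecific", "MultiPoint"]

print(resolve_directive("MAPFILE", KNOWN))
print(resolve_directive("SexL", KNOWN))
```

Calling resolve_directive("Sex", KNOWN) raises a ValueError, because the abbreviation matches both SexLinked and SexSpecific.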
Most directives require one or more arguments, which will take one of the following forms:
A range consists of a start value, a literal '-', an end value, a literal ':', and an increment. Start, end and increment values may be integers or real numbers. Start and end values may be positive or negative, with the restriction that the end must be no less than the start. The increment must be a positive value. Intervening whitespace may be added for clarity, but is not necessary. Kelvin will expand the range into a list of values that begins with the start value, increases by the increment, and ends with the last value that is no greater than the end value. For example, a range that runs from -1 to 1 in increments of 0.1 would be specified as '-1 - 1 : 0.1'. A range specified as '0-1:0.15' would expand to 0, 0.15 ... 0.75, 0.90.
Kelvin configuration files can be quite simple. Here's a file for a multipoint analysis that will generate PPLs for a dichotomous trait at one centiMorgan intervals, considering 4 markers around each trait position:
Lines 3 and 4 specify a 4-marker multipoint analysis, and that the trait should be placed at 1 cM intervals, starting at 0 cM and running through the end of the map. Line 6 specifies the values that will appear in the affection status column in the pedigree file. Lines 8-13 specify input and output files.
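The 1 cM trait positions in this example are an instance of the range argument form. As an illustration only, and not Kelvin's own parser, the expansion rules described for ranges could be sketched in Python as follows. The function name expand_range is hypothetical, Decimal arithmetic is a choice of this sketch to avoid floating-point drift, and the literal end accepted by TraitPositions is not handled.

```python
import re
from decimal import Decimal

NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)"
RANGE = re.compile(rf"^\s*({NUMBER})\s*-\s*({NUMBER})\s*:\s*({NUMBER})\s*$")


def expand_range(spec):
    """Expand 'start - end : increment' into the list of values it denotes."""
    match = RANGE.match(spec)
    if not match:
        raise ValueError(f"not a range: {spec!r}")
    start, end, step = (Decimal(group) for group in match.groups())
    if end < start:
        raise ValueError("end must be no less than start")
    if step <= 0:
        raise ValueError("increment must be positive")
    values = []
    current = start
    while current <= end:  # stop at the last value not greater than end
        values.append(float(current))
        current += step
    return values


print(expand_range("0-1:0.15"))
print(len(expand_range("-1 - 1 : 0.1")))
```

The first call reproduces the documented expansion of '0-1:0.15', which stops at 0.90 because the next step, 1.05, would exceed the end value.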
Here's a file for two-point analysis that will generate PPLs and linkage disequilibrium statistics between a dichotomous trait and each marker in the dataset:
Two-point analysis is the default, so no directive is necessary to specify that. Line 3 specifies a linkage disequilibrium analysis. Line 5 specifies the values that will appear in the affection status field of the pedigree. Lines 7-13 specify input and output files. Line 14 specifies that additional information should be written to the MOD file specified on line 13.
Kelvin directives are intended to be fairly self-descriptive.
In this reference, the following conventions are used to describe the valid arguments to directives.
Arguments in roman type are literals. If they are to be used in the configuration file, they must appear as written in this reference. For example,
QT ChiSq
ChiSq is a legal argument to the QT directive.
Arguments in <angle brackets> are symbolic, and refer to one of the four types of arguments described above. If they are to be used in the configuration file, they must conform to the description of the respective argument type. For example,
PedigreeFile <filename>
The PedigreeFile directive requires a single argument that conforms to the filename type.
MarkerToMarker [ All | Adjacent ]
The MarkerToMarker directive can take either of the literals All or Adjacent as arguments.
QTT Normal { <mean>, <standarddeviation> }
The QTT directive with the Normal argument optionally accepts two additional arguments, <mean> and <standarddeviation>.
TraitPositions [ Marker | <value> | <range> ] {, ... }
In this example, the TraitPositions directive can take either a literal Marker, a single <value> or a <range>, and furthermore can take one or more of any of the preceding, after a separating comma.
PedigreeFile <filename>
Defaults to pedfile.dat if not specified.
LocusFile <filename>
Defaults to datafile.dat if not specified.
FrequencyFile <filename>
Defaults to markers.dat if not specified.
MapFile <filename>
Defaults to mapfile.dat if not specified.
BayesRatioFile <filename>
Defaults to br.out if not specified.
PPLFile <filename>
Defaults to ppl.out if not specified.
MODFile <filename>
ExtraMODs
ProgressLevel <number>
The default is 1. Use this in conjunction with the ProgressDelaySeconds directive to fine-tune the flow of progress information to suit your needs.
ProgressDelaySeconds <number>
CTRL-\
....running...
will be displayed.
MultiPoint <count>
<count> adjacent markers, centered on each trait location, will be used to calculate the PPL. This directive is incompatible with the MarkerToMarker and LD directives.
LD
MarkerToMarker [ All | Adjacent ]
If All is specified, all possible pairs of markers in the dataset are considered. If Adjacent is specified, only adjacent pairs of markers are considered. This directive is incompatible with the MultiPoint directive.
QT Normal { <mean>, <standarddeviation> }
QT ChiSq
If Normal is specified, the sample <mean> and <standarddeviation> values may be provided. Otherwise, a sample mean and standard deviation will be calculated using the trait values in the pedigree file. This directive is incompatible with the MarkerToMarker directive.
QTT Normal { <mean>, <standarddeviation> }
QTT ChiSq
Threshold [ <min>, <max> | Fixed <value> ]
<min> and <max> should be values. This directive requires the QTT directive.
LiabilityClasses <number>
Epistasis <markername>
Penetrances are allowed to depend upon genotypes at <markername>. Marker genotype data for the selected marker must be provided using the EpistasisPedigreeFile and EpistasisLocusFile directives. If the marker also appears in the dataset to be analyzed, it will be dropped. This directive is incompatible with the LiabilityClasses directive.
EpistasisPedigreeFile <filename>
EpistasisLocusFile <filename>
EpistasisFrequencyFile <filename>
PhenoCodes <unknown>
PhenoCodes <unknown>, <unaffected>, <affected>
TraitPositions [ Marker | <value> | <range> ] {, ... }
Marker indicates that PPLs should be calculated at each marker position, in addition to any other positions. If a range is specified, the end value of the range may be specified as the literal end. In that case, Kelvin will expand the range so the last value is just past the last marker in the map. The literal end in a range specification is only valid with this directive. This directive requires the MultiPoint directive, and is thus implicitly incompatible with the MarkerToMarker and LD directives.
SexLinked
SexSpecific
Imprinting
PolynomialScale <scale>
<scale> is an integer between 1 and 10. This directive is incompatible with the NonPolynomial directive.
NonPolynomial
TraitPrevalence <value>
SkipPedCount
SkipCountWeighting
SkipEstimation
SkipAnalysis
DryRun
CountFile <filename>
ForceBRFile
AllowKosambiMap
NIDetailFile <filename>
SurfaceFile <filename>
SurfacesPath <dirname>
QTMeanMode [ Vary | Same | Fixed <value> ]
Same specifies that the mean can only vary between traits, whereas Vary allows it to vary between genotypes as well; the range of possible values is constrained using the Mean directive below. Fixed specifies a fixed mean. The default is Vary. This directive requires either the QT or QTT directive with the Normal distribution. This directive is incompatible with the MarkerToMarker directive.
QTStandardDevMode [ Vary | Same | Fixed <value> ]
Same specifies that the standard deviation can only vary between traits, whereas Vary allows it to vary between genotypes as well; the range of possible values is constrained using the StandardDev directive below. Fixed specifies a fixed standard deviation. The default is Same. This directive requires either the QT or QTT directive with the Normal distribution. This directive is incompatible with the MarkerToMarker directive.
Mean <min>, <max>
<min> and <max> should be specified as values. The defaults are min=-3, max=3. This directive requires either the QT or QTT directive with the Normal distribution. This directive is specified differently (and is required) when using FixedModels with a quantitative trait; see Mean below. This directive is incompatible with the MarkerToMarker directive.
StandardDev <min>, <max>
<min> and <max> should be specified as values. The defaults are min=0.25, max=2.5. This directive requires either the QT or QTT directive with the Normal distribution. This directive is specified differently (and is required) when using FixedModels with a quantitative trait; see StandardDev below. This directive is incompatible with the MarkerToMarker directive.
DegreesOfFreedom <min>, <max>
<min> and <max> should be specified as values. This directive requires either the QT or QTT directive with the ChiSq distribution. This directive is specified differently (and is required) when using FixedModels with a quantitative trait; see DegreesOfFreedom below. This directive is incompatible with the MarkerToMarker directive. This directive is normally disabled for distribution versions of Kelvin.
Truncate <min>, <max>
<min> and <max> should be specified as values. This directive requires either the QT or QTT directive, and is incompatible with the MarkerToMarker directive.
Kelvin can be configured to perform calculations based on fixed grids of parameter values. Use this option with extreme caution when computing PPLs: it is not compatible with Kelvin's underlying numerical integration routines and can return erroneous BRs. It can be useful, however, where a very fine grid of values is wanted to increase the precision of the MOD and/or the maximizing values, or to calculate fixed-model LODs. It can also be configured to reproduce the "fixed-grid" numerical integration routines used by older versions of Kelvin, for comparison with results generated under those versions. The FixedModels directive enables this behavior, and is required for any advanced directives that fix points of the trait model.
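The trade-off between a coarse and a fine fixed grid can be illustrated with a small Python sketch. This is not Kelvin code: `toy_score` is a hypothetical stand-in for whatever statistic is being maximized, the parameter names `theta` and `gene_freq` are illustrative only, and ranges are given as (start, end, step) numbers rather than in Kelvin's own range syntax.

```python
from itertools import product


def expand_range(start, end, step):
    """Return evenly spaced values from start to end, both inclusive."""
    count = int(round((end - start) / step))
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return [round(start + i * step, 10) for i in range(count + 1)]


def grid_maximum(score, axes):
    """Evaluate score at every grid point; return (best_value, best_point)."""
    names = list(axes)
    best = None
    for values in product(*(axes[name] for name in names)):
        point = dict(zip(names, values))
        value = score(**point)
        if best is None or value > best[0]:
            best = (value, point)
    return best


def toy_score(theta, gene_freq):
    """Hypothetical smooth surface peaked at theta=0.1, gene_freq=0.3."""
    return -((theta - 0.1) ** 2) - ((gene_freq - 0.3) ** 2)


coarse_axes = {
    "theta": expand_range(0.0, 0.5, 0.25),
    "gene_freq": expand_range(0.0, 1.0, 0.5),
}
fine_axes = {
    "theta": expand_range(0.0, 0.5, 0.05),
    "gene_freq": expand_range(0.0, 1.0, 0.1),
}

coarse_best = grid_maximum(toy_score, coarse_axes)
fine_best = grid_maximum(toy_score, fine_axes)
```

On this toy surface the coarse grid can only report the nearest grid point to the true peak, while the fine grid lands on it. The cost is the number of evaluations, which grows with the product of the axis lengths.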
FixedModels
Constraint Penetrance [ [ DD | Dd | dD | dd ] { <class> } [ == | != | > | >= ] [ DD | Dd | dD | dd ] { <class> } ] {, ... }
Constraint Mean [ [ DD | Dd | dD | dd ] { <class> } [ == | != | > | >= ] [ DD | Dd | dD | dd ] { <class> } ] {, ... }
Constraint StandardDev [ [ DD | Dd | dD | dd ] { <class> } [ == | != | > | >= ] [ DD | Dd | dD | dd ] { <class> } ] {, ... }
Constraint DegreesOfFreedom [ [ DD | Dd | dD | dd ] { <class> } [ == | != | > | >= ] [ DD | Dd | dD | dd ] { <class> } ] {, ... }
Constraint Threshold [ <class> [ == | != | > | >= ] <class> ] {, ... }
Alpha [ <value> | <range> ] {, ... }
DiseaseGeneFrequency [ <value> | <range> ] {, ... }
DPrime [ <value> | <range> ] {, ... }
Penetrance [ DD | Dd | dD | dd ] [ <value> | <range> ] {, ... }
DD, Dd, dD or dd), when conducting a dichotomous trait analysis. This directive requires the FixedModels directive, and is incompatible with the QT, QTT and MarkerToMarker directives. The dD genotype is only legal if the Imprinting directive is also specified.
Theta [ <value> | <range> ] {, ... }
Threshold [ <value> | <range> ] {, ... }
Mean [ DD | Dd | dD | dd ] [ <value> | <range> ] {, ... }
DD, Dd, dD or dd). This directive requires either the QT or QTT directive with the Normal distribution. The dD genotype is only legal if the Imprinting directive is also specified. This directive is incompatible with the MarkerToMarker directive.
StandardDev [ <value> | <range> ] {, ... }
Normal distribution. This directive is incompatible with the MarkerToMarker directive.
DegreesOfFreedom [ DD | Dd | dD | dd ] [ <value> | <range> ] {, ... }
DD, Dd, dD or dd). This directive requires either the QT or QTT directive with the ChiSq distribution. The dD genotype is only legal if the Imprinting directive is also specified. This directive is incompatible with the MarkerToMarker directive. This directive is normally disabled for distribution versions of Kelvin.
DiseaseAlleles <number>
MaxIterations <number>
<number> iterations.
Study <label> [ client | server ] <dbhost> <dbusername> <dbpassword> <pedids include regex> <pedids exclude regex> { MCMC <total samples> <start of sample ids> <end of sample ids> }
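Many of the directives above state which other directives they require or are incompatible with. A configuration file can be pre-checked against such rules before a run. The following Python sketch is only an illustration and is not part of Kelvin: the rule tables cover a small subset of the rules stated in this reference, the helper names are hypothetical, and the parser assumes one directive per line, spelled exactly as shown here.

```python
# Subset of the "incompatible with" rules stated in this reference.
INCOMPATIBLE = {
    "MultiPoint": {"MarkerToMarker", "LD"},
    "QT": {"MarkerToMarker"},
    "Epistasis": {"LiabilityClasses"},
    "PolynomialScale": {"NonPolynomial"},
    "Penetrance": {"QT", "QTT", "MarkerToMarker"},
    "Truncate": {"MarkerToMarker"},
}

# Subset of the "requires" rules. Each set lists alternatives,
# at least one of which must be present.
REQUIRES = {
    "TraitPositions": [{"MultiPoint"}],
    "Threshold": [{"QTT"}],
    "Penetrance": [{"FixedModels"}],
    "Truncate": [{"QT", "QTT"}],
}


def directive_names(config_text):
    """Collect the first word of every non-blank line."""
    names = set()
    for line in config_text.splitlines():
        parts = line.split()
        if parts:
            names.add(parts[0])
    return names


def check_directives(config_text):
    """Return a list of human-readable rule violations (empty if none)."""
    present = directive_names(config_text)
    problems = []
    for name in sorted(present):
        for other in sorted(INCOMPATIBLE.get(name, set()) & present):
            problems.append(f"{name} is incompatible with {other}")
        for alternatives in REQUIRES.get(name, []):
            if not (alternatives & present):
                needed = " or ".join(sorted(alternatives))
                problems.append(f"{name} requires {needed}")
    return problems
```

For example, `check_directives("TraitPositions Marker")` reports that TraitPositions requires MultiPoint, while adding a `MultiPoint` line clears the report. Keeping the rules in plain tables makes it easy to extend the check as more directives are used.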
The Diagnostics directives are all for debugging purposes. They are separated by area of concern, and each takes an integer "diagnostic level" that determines how much detailed debugging output is shown for that class of diagnostics.
Diagnostics directives have no effect for distribution versions of Kelvin.
DiagOVERALL <value>
DiagLIKELIHOOD <value>
DiagREAD_PEDFILE <value>
DiagALLELE_SET_RECODING <value>
DiagGENOTYPE_ELIMINATION <value>
DiagPARENTAL_PAIR <value>
DiagCONFIG <value>
DiagINPUTFILE <value>
DiagXM <value>
DiagDCUHRE <value>
DiagPOLYNOMIAL <value>
DiagALTLSERVER <value>
DiagMAX_DIAG_FACILITY <value>
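The idea of a per-class diagnostic level can be sketched in Python as follows. This is an illustration only, not Kelvin's implementation: it assumes that a higher level means more detail, that a message is shown when its own level does not exceed the configured level for its class, and that classes with no directive default to level 0. The function names are hypothetical.

```python
def parse_diag_levels(config_text):
    """Map each diagnostic class (the name after 'Diag') to its level."""
    levels = {}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("Diag"):
            facility = parts[0][len("Diag"):]
            levels[facility] = int(parts[1])
    return levels


def should_show(levels, facility, message_level):
    """Assumed rule: show a message if its level is within the configured level."""
    return message_level <= levels.get(facility, 0)


levels = parse_diag_levels("DiagXM 2\nDiagDCUHRE 0\n")
```

With these settings, a level 1 message for the XM class would be shown, a level 3 message for XM would be suppressed, and any message of level 1 or higher for an unconfigured class such as POLYNOMIAL would be suppressed.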