Input

The input form of CLARE is divided into three sections, each in its own box. The section “Regulatory elements with shared activity” allows the user to paste DNA sequences for regulatory elements in FASTA format. The section “Control sequences” specifies the background sequences. By clicking on ‘copy and paste control sequences’, the user can provide their own control sequences in FASTA format. By default, CLARE extracts a random set of noncoding sequences from the human genome, with lengths and GC content adjusted to match those of the regulatory elements. Additionally, the user has the option to enter a locus for regulatory element prediction in FASTA format (“Sequence(s) for prediction of regulatory elements”). In all cases, the total sequence size may not exceed 2 Mb. The field “return to a previously submitted job” allows the user to enter a job ID to display results stored on our web server.
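The input constraints described above can be checked locally before submission. The following is a minimal sketch, not CLARE's own validation code: the function names are hypothetical, and it assumes that “2 Mb” means 2,000,000 bases in total. The GC-content helper illustrates the quantity that the default control set is matched on.

```python
from typing import Dict

# Assumption: the 2 Mb limit is read as 2,000,000 bases in total.
MAX_TOTAL_BASES = 2_000_000


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}; sequences are upper-cased."""
    records: Dict[str, str] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate FASTA header: {header!r}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header] += line.upper()
    return records


def total_length(records: Dict[str, str]) -> int:
    """Total number of bases across all records."""
    return sum(len(seq) for seq in records.values())


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in an upper-case sequence (0.0 if empty)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def within_size_limit(records: Dict[str, str],
                      limit: int = MAX_TOTAL_BASES) -> bool:
    """True if the combined sequence length does not exceed the limit."""
    return total_length(records) <= limit


# Small usage example with two made-up sequences.
example = ">enh1\nACGTGGCC\n>enh2\nATAT\n"
recs = parse_fasta(example)
print(total_length(recs), within_size_limit(recs))
print({name: gc_content(seq) for name, seq in recs.items()})
```

Running such a check first avoids submitting a set that the server's own form validation would reject for size.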



After checking the input form for errors, CLARE places submissions in a job queue. Each job is assigned a job identity number (job or request ID), which the user may use to check the status, and later the outcome, of the job.



Waiting times depend on the number of jobs in the queue. Normally, your job should be processed promptly. A successful CLARE run may take from several minutes up to hours, depending on the number and length of your enhancer sequences. Typical execution time for training a model with 200 regulatory elements of average length 230 base pairs is between 5 and 10 minutes. We therefore recommend using CLARE for sequence sets of up to a few hundred sequences. After completing a job, CLARE will automatically display a results page.





Output

After completing a job, CLARE displays a results page divided into several sections:

• Request ID:




This 16-digit number can be used to retrieve and display previous results stored on our web server.

• Characteristics of the positive (signal) set:




• De novo predicted motifs:

Sequence logos for motifs overrepresented among the sequences in the signal set. A text file with the corresponding position weight matrices can be downloaded by clicking on “matrices for de novo motifs”.




• Feature weights:

CLARE produces a table and a graph displaying relevant features and their corresponding weights. Additionally, CLARE shows the location of relevant TRANSFAC and JASPAR motifs in the signal set, and their frequency in the signal set compared with the control set.




• Model accuracy:




• Locus scan:

If the user has entered a collection of loci for prediction, CLARE will produce a text file (“List of positively scoring elements”) listing all 150-base-pair windows with positive scores, and a graph showing the scores for all 150-base-pair windows after the input loci have been concatenated.




• Datasets:

Finally, CLARE includes links to the signal and control sequences in FASTA format, the loci entered for prediction (if any were provided), and a text file with information concerning the model.
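The locus scan described in the list above can be pictured with a short sketch. It is an illustration only: the text does not state how far consecutive windows are shifted, so the step size here is an assumption (non-overlapping windows by default), and `toy_score` is a made-up stand-in for the trained model's scoring function, not CLARE's classifier.

```python
from typing import Callable, Iterable, List, Tuple

WINDOW = 150  # window length given in the text


def scan_windows(loci: Iterable[str],
                 score: Callable[[str], float],
                 window: int = WINDOW,
                 step: int = WINDOW) -> List[Tuple[int, float]]:
    """Concatenate the loci and score every full-length window.

    Returns (start offset in the concatenated sequence, score) pairs.
    The step size is an assumption; the source does not specify it.
    """
    concatenated = "".join(loci)
    scored = []
    for start in range(0, len(concatenated) - window + 1, step):
        scored.append((start, score(concatenated[start:start + window])))
    return scored


def positive_windows(scored: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Keep only windows whose score is strictly positive."""
    return [(start, value) for start, value in scored if value > 0]


def toy_score(seq: str) -> float:
    """Hypothetical placeholder score: GC fraction minus 0.5."""
    gc = sum(1 for base in seq.upper() if base in "GC")
    return gc / len(seq) - 0.5


# Three made-up loci of 150 bp each, scanned with non-overlapping windows.
loci = ["G" * 150, "A" * 150, "C" * 75 + "T" * 75]
all_scores = scan_windows(loci, toy_score)
hits = positive_windows(all_scores)
print(all_scores)
print(hits)
```

In this picture, the full list of scores corresponds to the graph over all windows, and the filtered list corresponds to the “List of positively scoring elements” file.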