Introduction to SeqLab
For instructions on configuring your computer and starting Seqlab, please see the GCG FAQ.
Begin by starting the Seqlab GUI. You should see the following:

Click on the OK button to close the About SeqLab window. Note that this window informs you of the citation you should give to GCG if you publish any papers that utilize GCG programs.
Main List versus Editor
SeqLab is GCG's newest graphical user interface (GUI). SeqLab is an amalgamation of the older GUI called WPI and the sophisticated multiple sequence alignment editor that was developed by the Ribosomal Database Project (RDP). The older WPI portion of SeqLab is now known as the Main List and contains over 130 programs for analyzing RNA, DNA and protein sequences. The WPI interface previously contained a program called seqed that was used to enter sequence information. That portion of the program has been removed, and all sequence entry is now handled by the Editor portion of the program. The Main List and Editor portions of the program are linked through a mode button, so that you can move between them to accomplish various tasks.
The main list is set up so that you can efficiently organize and execute your work. GCG assumes that you will have separate projects with related sequences. Thus, when you choose the file new or file open options from the Main window shown below, you will be creating or opening a project file. These project files are often referred to as working lists. Options present in the main list menus make it easy to add sequences from remote databases or local directories to each working list or to remove unwanted sequences. You should note that this organization means that if you want to analyze a sequence and it is not already in a working list, you must 1) add that sequence to an existing working list, or 2) create a new working list and then add the sequence to that list. For one sequence this is slightly cumbersome, but this inconvenience is more than compensated for by the ease with which multiple sequences can be handled.
Once you have closed the About SeqLab window you will see the following blank editor window, provided you have never created a working list before.

If, however, you ever used WPI under version 8 of GCG, when you start Seqlab, you will first see the Main List instead of the editor window (shown below). It is simple to switch back and forth. All you have to do is click on the mode button and select either Main List or Editor from the pull down menu. Note: If you have previously created a working list, the program will automatically default to the last working list you used.
If your screen came up in editor mode, you should now select the Main List option from the mode pull down menu.

Next create your own working list by:
- Selecting the File menu and choosing the New option.
- Giving the file whatever name you wish, ending with .list (e.g. temp.list), and clicking on the OK button (see the example on the next page).

Next add sequences according to the following directions.
Adding Sequences to The Working List In Both Editor and Main List (Note Different Extensions)
To add sequences to the main list, first go to the file menu and then to the Add Sequences From choice (screenshot below). You can then add sequences to the working list from your own files or from the databases. Note that sequences in the main list are just pointers to the actual files. This means that the files entered from genbank or other remote sites are not actually copied to your file space. Instead they will be copied into memory from the database whenever they are needed.

For this course, you will choose the Database option. The screen on the next page should appear, showing the available databases. You should select the Bacterial section of genbank (see the example on the next page). You can search for the sequences you wish to load by using letters common to the names of the sequences you are interested in. Alternatively, you can search by accession numbers. For example, all the sequences you will be using today have nifh in their name. You will therefore click in the database specification box, type in *nifh* after bacterial: and click on the Show Matching Entries button at the top of the page (again see diagram on the next page). Make sure you delete or type over the default specification (in this case bacterial:*). Also remember that in SeqLab the backspace key often acts as a destructive delete, i.e. it deletes the character the cursor is on or the character to the right of the cursor, as opposed to a normal backspace, which deletes the character to the left of the cursor.
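The wildcard search described above can be illustrated with a short Python sketch. This is only a model of the idea, not how SeqLab itself performs the lookup: the function name `matching_entries` is hypothetical, and the sketch assumes that matching ignores case and that `*` stands for any run of characters.

```python
from fnmatch import fnmatchcase


def matching_entries(specification, names):
    """Return the names selected by a 'database:pattern' specification.

    Assumption: matching ignores case and '*' matches any run of characters.
    """
    _database, _, pattern = specification.partition(":")
    pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


# Hypothetical contents of one database section. The three nifh names come
# from the exercise; "examplegene" is made up to show a non-match.
bacterial_names = ["ukbnifh7", "ukbnifh23", "clonifh", "examplegene"]

nifh_matches = matching_entries("bacterial:*nifh*", bacterial_names)
all_matches = matching_entries("bacterial:*", bacterial_names)
print(nifh_matches)
print(len(all_matches))
```

The default specification bacterial:* matches every entry in the section, which is why it should be replaced with something narrower before searching.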

Generally in thirty seconds or less the sequences will appear in the Show Matching Entries list (screenshot follows) and you can select these just as you would in a word processor program. To select individual sequences, hold down the control key and click on the entries you wish to add to the working list. To select a block of sequences, click on the top member of the block, then hold down the shift key and click on the bottom member of the block. Everything in between will be selected. When you have finished selecting sequences, click on the Add to Main Window button at the bottom of the window and the sequences will be added to the working list.

For this exercise we would like you to select the following sequences: gb_ba:ukbnifh7, gb_ba:ukbnifh23, and gb_ba:clonifh. Using the methods described in the paragraph above, please add these sequences to your new working list. After clicking on the first sequence you wish to select, be sure to hold down the control key before selecting each additional entry. Next you will add a fourth sequence using its accession number. In this case click on the database specification field, highlight the current entry and type in the accession number L00688 (these are zeros, not Os). The Database Specification field should read Bacterial:L00688 (see the picture on the next page). Again click on Show Matching Entries and, when the sequence is retrieved, click on Add to Main List.

Note: To add sequences while in the editor mode you use essentially the same procedure. You go to the file menu item and then go to the add sequences from item. Note that you have three choices here: Databases, sequence files, and main list. You can pull in sequences from any of these. You also have a separate menu choice of a new sequence. The new sequence choice lets you enter in sequences by hand or cut and paste from other applications. This choice allows the editor to replace the old seqed program.
When you are done your Main List window should look like the following:

Understanding Which Sequences Are Local and Non-local
When you add a sequence to the main list, you are merely adding a pointer to the file where the actual sequence lives. The actual sequence can live in your own directory space or live in the databases. Sequences that are created or modified in the sequence editor are different. In order to edit a sequence, the editor has to load the sequence from whatever the normal source is, generally a database file. Then once the loaded file is modified, it is written in your local directory in order to preserve the changes you made without modifying the database file. The new file is stored in what is known as an rsf format. The modified sequence can then be added to a working list.
Adjusting Properties of the Sequence
The next portion of this exercise demonstrates how to modify the properties of the sequences contained in your working list. In the main list window, double click on the clonifH sequence. (This particular double-click requires a certain speed between clicks that can take a while to learn.) If you click at the correct speed you will see the sequence expand and show the following screen.

You will see a reference section which contains information about the sequence, such as where it was published, the length of the sequence, the boundaries for the coding and non-coding regions, etc. You can expand this box by dragging down the little box that is located on the right hand side of the line just below the reference box. You can then use the scroll bar on the right hand side to move up and down to read the information. You should note that the clonifH sequence contains several coding regions and that the coding sequence for the actual nifH gene runs from base pair 2623 to 3444. If we do not select this region, some of the subsequent analyses will not work well because of the length of clonifH relative to the other sequences. Therefore, what we would like to do now is select this region. Note that you will want to minimize the reference window before proceeding in order to see more of the sequence at one time. At this point, you have two choices for selecting the region. The first method consists of scrolling down to around bp 2600 and highlighting base pairs 2623 to 3444 just as you would highlight text in a word processor. Once the text is selected, click on the apply button at the bottom of the screen. This will set the sequence so that only base pairs 2623 to 3444 will be used in any analysis. A second, simpler way is to type 2623 in the begin box and 3444 in the end box at the top of the screen and then hit the apply button.
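As a rough illustration of what the begin and end values mean, the Python sketch below extracts a range that is numbered from 1 and includes both end points, which is how the base pair positions above are quoted. The function name `select_range` and the placeholder sequence are assumptions for the example; the real bases would come from the clonifH entry.

```python
def select_range(sequence, begin, end):
    """Return bases begin..end, numbered from 1 and inclusive at both ends."""
    if not 1 <= begin <= end <= len(sequence):
        raise ValueError("need 1 <= begin <= end <= sequence length")
    # Python slices count from 0 and exclude the stop index,
    # so only the begin value has to be shifted by one.
    return sequence[begin - 1:end]


# Placeholder data standing in for the real clonifH sequence.
placeholder = "ACGT" * 1000  # 4000 bases

coding = select_range(placeholder, 2623, 3444)
print(len(coding))  # 3444 - 2623 + 1 = 822
```

The selected region is therefore 822 bases long, far shorter than the full entry.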

Note that in this window, you can also adjust other properties of the sequence. You can choose forward or reverse strand, whether it is linear or circular, you can give it a weight that will affect certain programs, and you can select parts of the sequence GCG programs should be concerned with (e.g. only the coding regions). To choose sections of the sequence you can either type the beginning or ending base number in the begin and end box or select the section from the sequence using your mouse.
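For readers unsure what choosing the reverse strand implies, the following Python sketch computes a reverse complement. It assumes that selecting the reverse strand means programs read the reverse complement of the stored sequence; the function name is hypothetical and the sketch handles only the four unambiguous DNA bases.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Complement every base, then reverse the order of the bases."""
    return sequence.translate(_COMPLEMENT)[::-1]


forward = "ATGC"
reverse = reverse_complement(forward)
print(reverse)
```

Applying the function twice returns the original sequence, which is a quick sanity check on any strand conversion.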
Sequence Editor
The next exercise will be to demonstrate how to enter new sequence information and how to copy and paste additional sequence information from existing sequences into the new sequence you are creating. To accomplish this task we will utilize the sequence editor portion of Seqlab. The editor program was designed originally for doing multiple sequence alignments as part of the ribosomal database project. It is one of the best editors available for this purpose. Unfortunately, it has a few idiosyncrasies which are a little confusing at first. However, once you understand what is going on you can enter and modify sequences with minimal effort.
Note: the menus and options for the sequence editor are similar to those in the main list portion of the program. The primary difference is that the editing menu contains the necessary tools for editing sequences. You will be able to run the standard analysis programs from either the main list or sequence editor modes.
Creating A New Sequence
Before you proceed to creating a new sequence, we would like you to select all four nifH sequences in the main list window. Once these are selected, go to the Mode button and select Editor from the drop down menu. This will carry all four sequences from the main list over to the working group in the editor. We are doing this so that you will have sequences from which you can copy information as part of the exercise, and to demonstrate that only highlighted sequences in the main list are automatically loaded into the sequence editor when one switches from main list to sequence editor mode.
If you have selected all four sequences and switched to editor mode successfully you will see the following screen:

To enter a new sequence, choose new sequence from the file menu in the editor. You will be prompted for the type of sequence: DNA, RNA, Peptide, etc. Select DNA. The editor will then look like this:

You can then start typing in bases. Note that both the tilde and the space act as gaps. Once you have entered 50 to 100 bases you can click on the i INFO icon in the middle of the page (see the following picture). A window (screenshot below) will open up and you can fill in the appropriate information about the new sequence you are creating including the name you wish it to have. Once the information has been entered hit OK. You should also note that at any point you wish to add additional sequences to this group you can add other sequences to this working list using the “File, Add Sequence From” pull down menu option.
To change the name of the new sequence to something meaningful or to enter other relevant information about the sequence, click on the i button or select the Sequence info item from the file menu. The following window will appear. Note also that you can enter or change other information such as accession number, author, type of sequence, etc.

Copying And Pasting
At this point we will demonstrate how additional sequence information can be added to the new sequence by copying all or part of existing sequences and pasting that information into the new sequence. The primary reason for doing this is to demonstrate the fastest way to create a complete sequence from several different sources, as would be required if you were combining regions from several different plasmids. In each of these cases, the most efficient method is to include all the sequences from which DNA segments will be copied or removed in the same working group. Then any necessary segments can be copied into the new sequence in the appropriate order.
In this example you should highlight one of the original four nifH sequences. Then you must get the segment you want to insert. To do this, go to the edit menu and choose Select range. In the select range box, enter the beginning and end points of the segment you want, then click the Select button. Then, making sure that the sequence is still selected, choose copy (or cut) from the edit menu.
At this point, or when you go to paste into the sequence, you may run into the following error message.

This results from the default protections that come with the Editor and are designed to protect the novice from deleting sections from sequences inadvertently. To correct this problem, first select all the sequences from which you want to remove the protections, then either go into the protections option under the file menu or click on the Lock icon (screenshot follows). Select all the files from which you wish to remove the protection and click all the options so that they look like this and you should be ready to go.

Once the permissions have been changed if necessary, then select the sequence you want to insert the segment into, in this case the new sequence you have been constructing. (Again, make sure the protections are set to allow pasting). Choose the insertion point either with the mouse or by using the select range option. Then press the paste button. The sequence will drop in where you indicated. You should check this to make sure the insertion went as planned. If desired, repeat the cut and paste procedure to help you remember how to repeat the process in the future.
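The copy and paste steps can be pictured with a small Python sketch that treats each sequence as a string. The function names, the toy sequences, and the convention that the pasted segment starts at the chosen position are all assumptions made for illustration; they are not taken from SeqLab itself.

```python
def copy_range(sequence, begin, end):
    """Copy bases begin..end (numbered from 1, inclusive) without changing the source."""
    return sequence[begin - 1:end]


def paste_at(target, segment, position):
    """Insert segment so that its first base becomes base `position` of the result."""
    if not 1 <= position <= len(target) + 1:
        raise ValueError("position must lie within or just after the target")
    index = position - 1
    return target[:index] + segment + target[index:]


# Toy data: an existing sequence to copy from and the new sequence being built.
source = "AAACCCGGGTTT"
new_sequence = "ATAT"

segment = copy_range(source, 4, 6)             # bases 4 to 6 of the source
combined = paste_at(new_sequence, segment, 3)  # insert before base 3
print(combined)
```

Comparing the lengths before and after the paste is a simple way to check that the insertion went as planned.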
Cutting
Now you should practice cutting out some of the sequence you have just created. To do this, first click on the sequence you have just created. To remove part of a sequence you can highlight the section if it is not very long and click on the Cut button. Again, the more efficient way to do this is to click on the edit button and select the range option. You can then specify the range you wish to cut, hit select, then hit the cut button. This should allow you to remove the segment.
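Cutting a range can be sketched in Python in the same spirit. The function name `cut_range` and the toy sequence are hypothetical; positions are numbered from 1 and include both end points.

```python
def cut_range(sequence, begin, end):
    """Remove bases begin..end (numbered from 1, inclusive).

    Returns a pair: the sequence that remains and the segment that was cut.
    """
    if not 1 <= begin <= end <= len(sequence):
        raise ValueError("need 1 <= begin <= end <= sequence length")
    removed = sequence[begin - 1:end]
    remaining = sequence[:begin - 1] + sequence[end:]
    return remaining, removed


remaining, removed = cut_range("ATCCCAT", 3, 5)
print(remaining, removed)
```

Returning the removed segment mirrors the fact that a cut, unlike a plain delete, leaves the segment available for pasting elsewhere.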
Deleting A Sequence
Once you are done you can delete the sequence you have created by selecting the sequence and then choosing the cut button or choosing the cut option from the edit menu. That should leave you with the original four nifH sequences. An alternative method is to go to the Main List, select the sequence to delete, go to the Edit menu and choose the Remove From List option.
Saving a Sequence or Group of Sequences
You will not save the new sequence we have just created, but how to do so is described in this section. Saving is not as obvious as you might like. To save a sequence, go to the file menu item and choose the Save as option. You will have to save the sequence (or sequences) into an rsf file, because that is the only way to retain the information from the editing session. Once the rsf file is created, it will appear in the main list, but the sequence itself will be hidden. Note, files in the main list that are pointers to one or more other files end with {*} characters. For example: test.rsf{*}, would contain one or more sequences. This is because the main list shows pointers to files and the rsf file is a file that contains the sequence(s). You can expand the rsf file to show what is in it by double clicking on it. Once expanded, the separate sequences in an rsf file displayed in the main list window can be used separately from the rest.
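If you ever process main list entries in a script of your own, the {*} convention can be recognized with a few lines of Python. This is a hypothetical helper, not part of GCG; it relies only on the naming convention described above.

```python
CONTAINER_SUFFIX = "{*}"


def describe_entry(entry):
    """Split a main list entry into (name, is_container).

    Entries ending in {*} point to a file, such as an rsf file,
    that holds one or more sequences.
    """
    if entry.endswith(CONTAINER_SUFFIX):
        return entry[:-len(CONTAINER_SUFFIX)], True
    return entry, False


print(describe_entry("test.rsf{*}"))
print(describe_entry("gb_ba:clonifh"))
```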
Running Programs Of Interest
You can run programs by going to the function menu item in either the main list or the editor (see figure below). GCG programs are grouped by function. To truly understand what each program does, it is best if you read the manual. Note that the version nine manual is organized alphabetically, not by function.

Understanding switches and options
Nearly all GCG programs have options. In the command line interface some of these options must be applied as switches on the same line as the program name. In SeqLab, all of the options appear as buttons, sliders, or boxes. To understand what the options do, you must read the Program Manual about that particular program.
Note: One bug in the previous version of the GUI is that some values adjusted by using sliders do not take effect unless you click in the corresponding value box.
For this exercise, please run the MapPlot program by first selecting a sequence and then selecting MapPlot under the Mapping section of the Function menu. Once the program window is open, click the Options button and spend a few minutes examining the options. Once you have set the options, such as the specific enzymes you wish to view and the maximal number of cuts that will be allowed (see below), run the program by clicking Run. At this point you can go to the Windows button at the top of the screen and select Job Manager from the pull down menu. The section below describes the function of the Job Manager.

Job Manager
Once a program has started it runs in the background so you can work on other tasks. To check on the status of your program you can go to the Windows menu and select Job Manager. This will show you the status of your job. After looking at the Job Manager click on the Close Job Manager button located on the lower left part of the window.

Output Manager
To view the output from one of the programs go to the Windows menu and select Output Manager. Click on the analysis of interest. If the output is something like a multiple sequence alignment, the aligned files can be added back to the current working list by choosing the “Add to main list” option.