^CSE2304^

## Tute 3, CSE2304, CSSE, Monash, Semester 1, 2004

### Weeks 7 and 8, 19 to 30 April 2004

**Class**:
Prepare your answers to the questions *before* the tute!
It will probably not be possible to cover all questions unless
the class has prepared them all in advance.
(The online versions of the tutes may contain extra information and links.)

**Tutors**:
(i) The purpose of the tutorials is not to solve the prac's!
(ii) The purpose of the tutorials is to check answers, and
to discuss particular sticking points, not to simply make answers available.

- Given a sequence,
x
_{1}, x_{2}, `...`, x_{n},
a *subsequence* is some (or all) of its elements, in the same order,
e.g.
x_{2}, x_{4}, x_{5}, x_{8}, say.

A *common subsequence* of two sequences
x_{1}, x_{2}, `...`, x_{n},
and
y_{1}, y_{2}, `...`, y_{m},
is a sequence,
z_{1}, z_{2}, `...`, z_{k},
that is a subsequence of the x's and also of the y's.

A *longest common subsequence* is a common subsequence
that is as long as possible.

Give C-code to find the length of a longest common subsequence:
`int LCS(int n, t x[], int m, t y[])`.

Note that the more similar the x-sequence and the y-sequence are,
the longer is their LCS.

- What is the time- and space-complexity of
your answer to the previous question (best, average and worst cases)?

- How many times are "base" and "body" executed for r(1000)?
void r(int n)
{ if( n > 1 )
{ body;
r(n/2);
}
else
base;
}

Why?

- Two
*intervals*, [a,b] and [c,d],
where a, b, c and d are floating-point numbers between 0.0 and 1.0 inclusive,
a<=b and c<=d, can overlap in the following ways:
1. [a__________b] 2. [a_______b]
[c__________d] [c_____d]
3. [a_________b] 4. [a___b]
[c__d] [c_____________d]

You are working in a scientific laboratory where a long-running experiment
is producing a sequence of intervals
[i_{1}, j_{1}],
[i_{2}, j_{2}], `...`,
[i_{n}, j_{n}].
For each interval,
[i_{m}, j_{m}], your colleagues need to know
which of the earlier intervals,
[i_{k}, j_{k}] k<m, if any,
overlap with [i_{m}, j_{m}].
When m becomes large you expect the number of earlier intervals
overlapping [i_{m}, j_{m}] to be
<<m (much less than m) in *most*,
but *not all*, cases.
Note that n can be very large indeed, in the millions or even more, and
you need an efficient solution.

Is the problem easy or hard? Why?
*Outline* a possible solution, i.e. algorithms and data-structures.
(Do not give C code.)
What is worst-case data for your solution? Why?

© L. Allison 2004,
School of Computer Science and Software Engineering,
Monash University, Australia 3800.
Created with "vi (Linux & Solaris)", charset=iso-8859-1