MySQL unique index questions

TheDude05

Limp Gawd
Joined
Jan 27, 2005
Messages
393
1. If I have two columns in the same unique index, does that mean mysql checks the unigue-ness based on both columns or individually?

2. I have a table with 9 columns and I want every row to be unique to one another (excluding the primary of course). This database logs our long distance phone calls so we can change departments accordingly. Obviously I dont want the same call being inserted more than once.

So my question is, do I put all these columns into a unique index (granted that i'm correct on question 1) or is there a better, more efficient way?
 
You can always run thru a loop and check to see if any of the rows match any of the info your trying to input before inserting. If its a match then you know its already there.
 
You should let the database check for duplicates; it's should use a very good algorithm to do so, and has better data locality than your client app. Creating a unique index can improve query performance and also puts the data integrity burden on the database, where it belongs.

1) both columns together. If you have two columns, {1,1} and {1,1} would be duplicates, but {3,1} and {1,3} aren't duplicates of {1,1} (or eachother).

2) You should find a primary key for this table. A primary key is what identifies a unique row. I don't know what columns you have in the table, but I would guess that the calling number and extension plus the time of the start of the call would be enough to make it unique. If your phone system is such that the same extension can initiate two calls at the same time, then you'll need to add something else.
 
Heres what my table looks like.

Code:
mysql> select * from statements limit 10;
+----+-----------+----------+-----------+------------+------------+---------+--------+---------+
| id | nprojcode | trunk_id | number_id | callgrp_id | call_date  | minutes | amount | special |
+----+-----------+----------+-----------+------------+------------+---------+--------+---------+
|  1 |      1098 |       29 |         3 |          8 | 1117162800 |     3.0 |   0.75 |         |
|  2 |      1098 |       29 |         4 |          1 | 1117163040 |     0.3 |   0.01 |         |
|  3 |      1098 |       29 |         5 |          1 | 1117163100 |     0.3 |   0.01 |         |
|  4 |      1098 |       29 |         6 |          1 | 1117163640 |     2.1 |   0.10 |         |
|  5 |      1098 |       29 |         7 |          2 | 1117163820 |     1.1 |   0.05 |         |
|  6 |      1098 |       29 |         8 |          1 | 1117164120 |     3.1 |   0.15 |         |
|  7 |      1098 |       29 |         6 |          1 | 1117204980 |     0.4 |   0.02 |         |
|  8 |      1098 |       29 |         9 |          1 | 1117204980 |     0.3 |   0.01 |         |
|  9 |      1098 |       29 |         9 |          1 | 1117205040 |     0.6 |   0.03 |         |
| 10 |      1098 |       29 |        10 |          3 | 1117205160 |     0.3 |   0.01 |         |
+----+-----------+----------+-----------+------------+------------+---------+--------+---------+
10 rows in set (0.00 sec)

Right now it has approx. 80,000 rows so I dont want to do a check on every value. I just thought a unique index that encompasses a few of the key columns would keep out duplicates. Thats what i'm probably going to do then
 
TheDude05 said:
Heres what my table looks like.
That isn't at all useful. To know which columns should be unique, we need to know what columns you have -- that is, what they really mean, what they really represent.

TheDude05 said:
I just thought a unique index that encompasses a few of the key columns would keep out duplicates. Thats what i'm probably going to do then
Indeed, you should use a unique index. (And maybe some additional non-unique indexes.) The trick is being smart about choosing which columns are in that index; you don't want to screw up your data model.
 
I thought I did but guess not?

I'm making a unique index with the trunk_id and date columns in it because you can never physically have the same trunk line being used at the exact same time. Therefore if they are in a unique index a value that matches that (aka the same call just being re-inserted) should never be injected into the database.

Am I still missing a piece to this?
 
TheDude05 said:
I'm making a unique index with the trunk_id and date columns in it because you can never physically have the same trunk line being used at the exact same time.
Your sample data shows such a case; id = 8 and id = 7. These appaear to be different calls as the duration is different.

Using a date column in a uinque index is tricky, as you need to know both the resolution for time your database supports and the resolution the system inserting the data will offer. If a call lasts shorter than the miniumum of those two resolutions, it will appear to be duplicate.
 
mikeblas said:
Your sample data shows such a case; id = 8 and id = 7. These appaear to be different calls as the duration is different.

hmmm... nice find. While technically I was correct, my data isn't. The data I'm supplied from the telephone company doesnt give seconds, just minutes. So those two calls were initiated within a minutes time, thus giving the same unix time stamp. Damn... :(

Using a date column in a uinque index is tricky, as you need to know both the resolution for time your database supports and the resolution the system inserting the data will offer. If a call lasts shorter than the miniumum of those two resolutions, it will appear to be duplicate.

Thats what is happening I guess. I use php's strtotime() function to create the timestamp but since I have it down to every minute, I can have timestamps that appear to be the same time. So basically I cant create a truely unique index because, while unlikely, it is possible for the same person to make a call to the same number, get the same trunk, talk for the same amount of time, all within a minutes time.
 
A brief re-statement of the problem:
You already have a primary key on the table, id. It is (obviously) unique, and can be used to identify a specific row.
You are after some way of recognising rows that have duplicate data in the remaining columns.

Can you do something like:
(forgive me if this doesn't work - I'm a SQLServer person, and I don't know how portable these examples are... oh, and they've not been tested...)

Code:
SELECT Min(Id), COUNT(*)
FROM statements
GROUP BY
nprojcode, trunk_id, number_id, callgrp_id, call_date, minutes, amount, special
HAVING COUNT(*) > 1
in mysql?
Failing that, would
Code:
SELECT MinId, RowCount
FROM
(
SELECT MinID = Min(Id), RowCount = COUNT(*)
FROM statements
GROUP BY
nprojcode, trunk_id, number_id, callgrp_id, call_date, minutes, amount, special
) AS DupeCount
WHERE RowCount > 1
work to retrieve a row that has a duplicate?

or have I just grabbed the wrong end of a sticky stick?
 
Back
Top