Thursday, March 29, 2012

CASE statement in a JOIN

Hi,
I have two tables TABLE_A and TABLE_B

TABLE_A has rows like this:
PROJECT_ID TASK_ID TASK_NAME
1 100 One Hundred
1 110 One Hundred Ten
1 120 One Hundred Twenty
2 200 Two Hundred
3 300 Three Hundred
3 310 Three Hundred Ten

TABLE_B has rows like this:
PROJECT_ID TASK_ID AMOUNT
1 100 1000
1 110 1100
2 NULL 2000
3 300 3000

I want to inner join TABLE_A and TABLE_B such that if TASK_ID is available in TABLE_B, then join should happen on TASK_ID (on TABLE_A.TASK_ID=TABLE_B.TASK_ID), if TASK_ID is not available the join should happen on PROJECT_ID.

For example for PROJECT_ID=2, there is no TASK_ID in TABLE_B (in this situation the join should be ON PROJECT_ID)

How can we do a CASE like situation here?

Thanks in advance
qADoes this work?

ON ft1.field1 = Coalesce(t2.field1)
OR ft1.field2 = Coalesce(t2.field2)|||I have never seen coalesce used with 1 argument.|||Oops... I have forgotten where I was coming from with that now...

Remove the coalesces and it might work..?
(*confused*)|||You can do it the way that georgev mentioned (without the coalesces), but reqardless of how you do this, you are almost surely going to have duplication in your result set. Any place that you have more than one record in TABLE_A with the same PROJECT_ID and record(s) in TABLE_B with that PROJECT_ID and a null TASK_ID, you are going to have duplication of the values in TABLE_B.|||I agree. If I use OR clause here, it would give lots of duplicates. Any other suggestions??|||on (TABLE_A.TASK_ID=TABLE_B.TASK_ID or ( (TABLE_A.TASK_ID<>TABLE_B.TASK_ID or TABLE_B.TASK_ID is null) and
TABLE_A.PROJECT_ID=TABLE_B.PROJECT_ID)
)
?|||look ma, no dupes:select TABLE_A.PROJECT_ID
, TABLE_A.TASK_ID
, TABLE_A.TASK_NAME
, TABLE_B.PROJECT_ID
, TABLE_B.TASK_ID
, TABLE_B.AMOUNT
from TABLE_A
inner
join TABLE_B
on TABLE_B.TASK_ID = TABLE_A.TASK_ID
union
select TABLE_A.PROJECT_ID
, TABLE_A.TASK_ID
, TABLE_A.TASK_NAME
, TABLE_B.PROJECT_ID
, TABLE_B.TASK_ID
, TABLE_B.AMOUNT
from TABLE_A
inner
join TABLE_B
on TABLE_B.PROJECT_ID = TABLE_A.PROJECT_ID
where not exists
( select 937
from TABLE_B
where TASK_ID = TABLE_A.TASK_ID )|||select TABLE_A.PROJECT_ID
, TABLE_A.TASK_ID
, TABLE_A.TASK_NAME
, TABLE_B.PROJECT_ID
, TABLE_B.TASK_ID
, TABLE_B.AMOUNT
from TABLE_A
inner
join TABLE_B
on TABLE_B.TASK_ID = TABLE_A.TASK_ID
union
select TABLE_A.PROJECT_ID
, TABLE_A.TASK_ID
, TABLE_A.TASK_NAME
, TABLE_B.PROJECT_ID
, TABLE_B.TASK_ID
, TABLE_B.AMOUNT
from TABLE_A
inner
join TABLE_B
on TABLE_B.PROJECT_ID = TABLE_A.PROJECT_ID
where not exists
( select *
from TABLE_B
where TASK_ID = TABLE_A.TASK_ID )|||Is it me, or is that identical to r937's post, except you changed the 937 to an asterix..?|||i love it when my answers are secretly adopted by people :)

by the way, it's "asterisk" -- asterix (http://en.wikipedia.org/wiki/Asterix) was a french comic book character

:)|||That might be why I spell it that way then ;)
Would I be correct in assuming that

IF EXISTS(SELECT 1 FROM MyTable)

Is more efficient than

IF EXISTS(SELECT * FROM MyTable)

Assuming that there is more than one column in the table?|||SELECT with EXISTS was optimized in SQL 2000 (I believe). It does not really pull back all of the columns as was the case I believe in SQL 7. I stopped eradicating it from code a while ago.|||Well a large portion of my work is still on 6.5 :p
I assumed that it would only ever need to return a single row - but since 6.5 doesn't have the TOP function, I assumed it returned the entire resultset.

Lots of assumptions, eh?|||I believe exists exits once it finds a nonqualifying row while NOT EXISTS will scan the whole result set.|||It seems so simple now! :p|||...while NOT EXISTS will scan the whole result set.actually, in this case it would likely be an index seek (or whatever it's called)

:)|||The reason that I stated that you will likely have duplicates regardless is because even in R937's code you could have duplicates in certain cases. While in your specific example you wouldn't have any duplicates with that code, you might have duplicates if there are any cases where you have any records on TABLE_B with a null TASK_ID and a PROJECT_ID that matches multiple records in TABLE_A or if you have multiple records in TABLE_B with a null TASK_ID. PROJECT_ID is evidently not a primary key in TABLE_A, whereas from your example, TASK_ID looks like it might be. For instance take the example below:

TABLE_A has rows like this:
PROJECT_ID TASK_ID TASK_NAME
1 100 One Hundred
1 110 One Hundred Ten
1 120 One Hundred Twenty
2 200 Two Hundred
3 300 Three Hundred
3 310 Three Hundred Ten
3 320 Three Hundred Twenty

TABLE_B has rows like this:
PROJECT_ID TASK_ID AMOUNT
1 100 1000
1 110 1100
2 NULL 2000
3 300 3000
3 NULL 4000
3 NULL 4100

Your result set will be:
PROJECT_ID TASK_ID TASK_NAME PROJECT_ID AMOUNT
1 100 One Hundred 100 1000
1 110 One Hundred Ten 110 1100
2 200 Two Hundred 200 2000
3 300 Three Hundred 300 3000
3 310 Three Hundred Ten NULL 4000
3 320 Three Hundred Twenty NULL 4000
3 310 Three Hundred Ten NULL 4100
3 320 Three Hundred Twenty NULL 4100

If you don't have any of that type of issue and are certain that you won't in the future, or you actually want that type of result in that type of instance, then no worries. Otherwise you might have a problem.

No comments:

Post a Comment