Finding duplicate records in a DB table, or transposing the data

Hello

I have a question...

Key | UID | Start Dt  | End Dt    | DESC
--------------------------------------------------------------------------------
1   | 101 | 12-Mar-09 | 30-May-09 | UID101
2   | 101 | 01-Jan-09 | 25-Feb-09 | UID101
3   | 102 | 13-Mar-09 | 30-Mar-09 | UID102
4   | 103 | 13-Mar-09 | 30-Mar-09 | UID103
5   | 103 | 13-Mar-09 | 01-Apr-09 | UID103
6   | 104 | 13-Mar-09 | 30-May-09 | UID104
7   | 104 | 25-Feb-09 | 29-May-09 | UID104
8   | 105 | 15-Feb-09 | 01-Mar-09 | UID105
9   | 105 | 01-Apr-09 | 30-May-09 | UID105

The query must identify the duplicate UIDs in the data above, which is stored in this form in a table. A UID counts as a duplicate when:

(1) the UID repeats (two records each); examples are 101, 103, 104 and 105.
(2) each record has two dates, a start date and an end date.
(3) the date ranges of the UID's records overlap. For example, key #4 (UID 103) runs from 13-Mar-09 to 30-Mar-09, and key #5 (also UID 103) runs from 13-Mar-09 to 01-Apr-09. The date range of key #4 overlaps the date range of key #5, so UID 103 is a duplicate by this definition.

By this definition only 103 and 104 should be selected. UID 102 has only a single row, and the date ranges for UID 105 are mutually exclusive (they do not overlap), as are those for UID 101.

Is there a built-in DB function I can use for this?


I do not want to delete the duplicate records.

The goal is a report that displays these duplicate records.

It would also help if I could get the data transposed per UID, that is, from

Key | UID | Start Dt  | End Dt    | DESC
4   | 103 | 13-Mar-09 | 30-Mar-09 | UID103
5   | 103 | 13-Mar-09 | 01-Apr-09 | UID103

to

UID | Start Dt1 | End Dt1   | Start Dt2 | End Dt2
103 | 13-Mar-09 | 30-Mar-09 | 13-Mar-09 | 01-Apr-09

Any advice or ideas would be great.

Thank you...

It can also be done without Analytics:

WITH test_data AS (
  SELECT  1 AS KEY, 101 AS UD, TO_DATE('03/12/2009','MM/DD/YYYY') AS START_DT, TO_DATE('05/30/2009','MM/DD/YYYY') AS END_DT, 'UD101' AS DSC FROM DUAL UNION ALL
  SELECT  2 AS KEY, 101 AS UD, TO_DATE('01/01/2009','MM/DD/YYYY') AS START_DT, TO_DATE('02/25/2009','MM/DD/YYYY') AS END_DT, 'UD101' AS DSC FROM DUAL UNION ALL
  SELECT  3 AS KEY, 102 AS UD, TO_DATE('03/13/2009','MM/DD/YYYY') AS START_DT, TO_DATE('03/30/2009','MM/DD/YYYY') AS END_DT, 'UD102' AS DSC FROM DUAL UNION ALL
  SELECT  4 AS KEY, 103 AS UD, TO_DATE('03/13/2009','MM/DD/YYYY') AS START_DT, TO_DATE('03/30/2009','MM/DD/YYYY') AS END_DT, 'UD103' AS DSC FROM DUAL UNION ALL
  SELECT  5 AS KEY, 103 AS UD, TO_DATE('03/13/2009','MM/DD/YYYY') AS START_DT, TO_DATE('04/01/2009','MM/DD/YYYY') AS END_DT, 'UD103' AS DSC FROM DUAL UNION ALL
  SELECT  6 AS KEY, 104 AS UD, TO_DATE('03/13/2009','MM/DD/YYYY') AS START_DT, TO_DATE('05/30/2009','MM/DD/YYYY') AS END_DT, 'UD104' AS DSC FROM DUAL UNION ALL
  SELECT  7 AS KEY, 104 AS UD, TO_DATE('02/25/2009','MM/DD/YYYY') AS START_DT, TO_DATE('05/29/2009','MM/DD/YYYY') AS END_DT, 'UD104' AS DSC FROM DUAL UNION ALL
  SELECT  8 AS KEY, 105 AS UD, TO_DATE('02/15/2009','MM/DD/YYYY') AS START_DT, TO_DATE('03/01/2009','MM/DD/YYYY') AS END_DT, 'UD105' AS DSC FROM DUAL UNION ALL
  SELECT  9 AS KEY, 105 AS UD, TO_DATE('04/01/2009','MM/DD/YYYY') AS START_DT, TO_DATE('05/30/2009','MM/DD/YYYY') AS END_DT, 'UD105' AS DSC FROM DUAL
)
select  t1.ud,
     t1.key, t1.start_dt, t1.end_dt,
     t2.key, t2.start_dt, t2.end_dt
from     test_data t1
,     test_data t2
where     t1.ud = t2.ud
  and     t1.key < t2.key
  and     ((t1.end_dt - t1.start_dt) + (t2.end_dt - t2.start_dt)) >
        (greatest(t1.end_dt, t2.end_dt) - least(t1.start_dt, t2.start_dt))
/

Result:

        UD        KEY START_DT   END_DT            KEY START_DT   END_DT
---------- ---------- ---------- ---------- ---------- ---------- ----------
       103          4 13-03-2009 30-03-2009          5 13-03-2009 01-04-2009
       104          6 13-03-2009 30-05-2009          7 25-02-2009 29-05-2009

You may also need to adjust the date comparison slightly, depending on whether two periods where the first END_DT equals the second START_DT should count as duplicates. As written, the strict > does not flag such periods; changing it to >= would.

Published by: tijmen on December 21, 2009 06:17
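
The overlap test and the transposed output can also be sketched outside Oracle. The following is a minimal sketch, not the poster's method: it assumes Python's built-in sqlite3 module as a stand-in database, a hypothetical table named periods with a hypothetical rec_key column, and ISO date strings (which compare correctly as text). It uses the common "each period starts before the other ends" test, and the touching_counts flag is an assumed name for the choice between <= and < discussed above.

```python
import sqlite3

# Sample rows from the question: (rec_key, uid, start_dt, end_dt).
ROWS = [
    (1, 101, "2009-03-12", "2009-05-30"),
    (2, 101, "2009-01-01", "2009-02-25"),
    (3, 102, "2009-03-13", "2009-03-30"),
    (4, 103, "2009-03-13", "2009-03-30"),
    (5, 103, "2009-03-13", "2009-04-01"),
    (6, 104, "2009-03-13", "2009-05-30"),
    (7, 104, "2009-02-25", "2009-05-29"),
    (8, 105, "2009-02-15", "2009-03-01"),
    (9, 105, "2009-04-01", "2009-05-30"),
]


def overlapping_pairs(rows, touching_counts=True):
    """Return one transposed row per overlapping pair of periods of a UID."""
    # "<=" treats periods that merely touch as overlapping; "<" does not.
    op = "<=" if touching_counts else "<"
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE periods "
        "(rec_key INTEGER, uid INTEGER, start_dt TEXT, end_dt TEXT)"
    )
    conn.executemany("INSERT INTO periods VALUES (?, ?, ?, ?)", rows)
    sql = f"""
        SELECT t1.uid, t1.start_dt, t1.end_dt, t2.start_dt, t2.end_dt
        FROM periods t1
        JOIN periods t2
          ON t1.uid = t2.uid AND t1.rec_key < t2.rec_key
        WHERE t1.start_dt {op} t2.end_dt
          AND t2.start_dt {op} t1.end_dt
        ORDER BY t1.uid, t1.rec_key, t2.rec_key
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


pairs = overlapping_pairs(ROWS)
for row in pairs:
    print(row)
```

With the sample data this reports UIDs 103 and 104 only, each with both periods side by side on one row.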

Tags: Database

Similar Questions

  • Find duplicate records in the fields.

    Hi all

    My table X has five VARCHAR fields: A, B, C, D and E. The four fields A, B, C and D are key fields.

    I would like to find the records that are duplicated in fields A, B, C and D.

    Please advise.

    Thank you
    KSG

    Hello
    Simply:

    SELECT A, B, C, D, Count(*)
      FROM your_table
     GROUP BY A, B, C, D HAVING Count(*) > 1;
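
If the full duplicated rows (including E) are wanted rather than just the key values and a count, the grouped result can be joined back to the table. This is a minimal sketch under assumptions: it uses Python's sqlite3 module instead of Oracle, and the table name your_table and the sample values are made up for illustration.

```python
import sqlite3


def duplicate_rows(rows):
    """Return every row whose (a, b, c, d) key occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table (a TEXT, b TEXT, c TEXT, d TEXT, e TEXT)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.a, t.b, t.c, t.d, t.e
        FROM your_table t
        JOIN (SELECT a, b, c, d
              FROM your_table
              GROUP BY a, b, c, d
              HAVING COUNT(*) > 1) g
          ON t.a = g.a AND t.b = g.b AND t.c = g.c AND t.d = g.d
        ORDER BY t.a, t.b, t.c, t.d, t.e
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: the first two rows share the same key.
sample = [
    ("k1", "x", "y", "z", "first"),
    ("k1", "x", "y", "z", "second"),
    ("k2", "x", "y", "z", "only"),
]
dups = duplicate_rows(sample)
print(dups)
```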
    
  • Find records in the date range

    Hello

    I have the following data
    with t as (
       select 1234 prod_id, to_date('01-Jan-2008', 'dd-MON-yyyy') start_date, to_date('01-May-2012', 'dd-MON-yyyy') end_date
        from dual union 
        select 4567 prod_id, to_date('01-Aug-2007', 'dd-MON-yyyy') start_date, to_date('01-Apr-2012', 'dd-MON-yyyy') end_date
        from dual union
        select 8910 prod_id, to_date('01-Jul-2006', 'dd-MON-yyyy') start_date, to_date('01-Mar-2012', 'dd-MON-yyyy') end_date
        from dual 
        )      
        SELECT *
      FROM t
    What is the best way to find all records between April 1, 2012 and April 30, 2012?

    Regards

    Assuming that you are looking for overlapping ranges

    SELECT *
      FROM t
     WHERE start_date BETWEEN date '2012-04-01' and date '2012-04-30'
        OR end_date BETWEEN date '2012-04-01' and date '2012-04-30'
        OR (    start_date < date '2012-04-01'
            AND end_date > date '2012-04-30' )
    

    which produces the two expected rows

    SQL> with t as (
      2     select 1234 prod_id, to_date('01-Jan-2008', 'dd-MON-yyyy') start_date, to_date('01-May-2012', 'dd-MON-yyyy') end_date
      3      from dual union
      4      select 4567 prod_id, to_date('01-Aug-2007', 'dd-MON-yyyy') start_date, to_date('01-Apr-2012', 'dd-MON-yyyy') end_date
      5      from dual union
      6      select 8910 prod_id, to_date('01-Jul-2006', 'dd-MON-yyyy') start_date, to_date('01-Mar-2012', 'dd-MON-yyyy') end_date
      7      from dual
      8      )
      9  SELECT *
     10    FROM t
     11   WHERE start_date BETWEEN date '2012-04-01' and date '2012-04-30'
     12      OR end_date BETWEEN date '2012-04-01' and date '2012-04-30'
     13      OR (    start_date < date '2012-04-01'
     14          AND end_date > date '2012-04-30' );
    
       PROD_ID START_DAT END_DATE
    ---------- --------- ---------
          1234 01-JAN-08 01-MAY-12
          4567 01-AUG-07 01-APR-12
    

    Justin
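
For rows where start_date is not later than end_date, the three-branch condition above appears to reduce to a single two-sided test: the row starts on or before the end of the range and ends on or after its start. The sketch below, in plain Python with hypothetical function names, checks both forms against the sample data.

```python
from datetime import date


def overlaps(start, end, range_start, range_end):
    """Two-sided test: the periods share at least one day."""
    return start <= range_end and end >= range_start


def three_branch(start, end, range_start, range_end):
    """The condition from the answer above, written out literally."""
    return (
        range_start <= start <= range_end
        or range_start <= end <= range_end
        or (start < range_start and end > range_end)
    )


# Sample rows from the question: (prod_id, start_date, end_date).
products = [
    (1234, date(2008, 1, 1), date(2012, 5, 1)),
    (4567, date(2007, 8, 1), date(2012, 4, 1)),
    (8910, date(2006, 7, 1), date(2012, 3, 1)),
]
april = (date(2012, 4, 1), date(2012, 4, 30))

matches = [pid for pid, s, e in products if overlaps(s, e, *april)]
print(matches)
```

In SQL the same idea would read start_date <= range end AND end_date >= range start, which is shorter to write than the three OR branches.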

  • How to delete a folder in a VMware datastore

    Hello

    First, VM4 and VM5 were migrated while the migration of VM20 was running, and the migration of VM4 and VM5 failed a few minutes later. However, a VM4 folder could be seen in the backup2 datastore. When VM4 was migrated again after the VM20 migration finished, a VM4_1 folder appeared in the backup2 datastore. The VM4 folder contains a DB_backup-flat hard disk file.

    So I have a few questions:

    (1) How can I remove VM4, since it is useless? Should I delete the DB_backup-flat hard disk file and then delete the VM4 folder? I can't find a way to remove the VM4 folder from the backup2 datastore. Can anyone provide the steps?

    (2) How can I rename the VM4_1 folder to VM4?

    Thank you!

    Thanks, Andre, for the clarification. Yes, the folder can be renamed while the VM is powered on, but this will lead to issues if we want to migrate the virtual machine to another datastore, or if we power it off and try to power it on again. So please do not rename the folder while the virtual machine is running.

    The second approach will work fine: rename the display name of the virtual machine, then Storage vMotion it to another datastore, and then you can bring the virtual machine back to the previous datastore.

    Reference: VMware KB: renaming a virtual machine and its files in VMware ESXi and ESX

  • How to get the distinct records for the max date.

    Hi all

    Here is the sample sql for sample table and data.

    Create table student (dept_id number(10), first_name varchar2(10),last_name varchar2(10),join_date date,years_attended number(10));

    insert into student values (1,'Ann','Coleman',to_date('3/7/1917','MM/DD/YYYY'),4);
    insert into student values (1,'Ann','Coleman',to_date('3/7/1916','MM/DD/YYYY'),5);
    insert into student values (2,'Rock','Star',to_date('1/1/1920','MM/DD/YYYY'),5);
    insert into student values (2,'Rock','Star',to_date('1/1/1921','MM/DD/YYYY'),6);
    insert into student values (3,'Jack','Smith',to_date('7/1/1900','MM/DD/YYYY'),3);

    insert into student values (3,'Jack','Smith',to_date('7/1/1901','MM/DD/YYYY'),4);

    commit;

    I need to get the records with the maximum date where the name and dept_id match. I wrote the query below and it returns the expected result, but I'm not sure it is efficient.

    SELECT s.dept_id, s.first_name, s.years_attended
    FROM (SELECT dept_id, MAX (join_date) join_date
          FROM student GROUP BY dept_id) x
    JOIN student s ON x.dept_id = s.dept_id AND x.join_date = s.join_date;

    The query above returns the records below, which is the goal.

    DEPT_ID NAME YEARS_ATTENDED
    1       Ann  4
    2       Rock 6
    3       Jack 4

    Can you please let me know whether the SQL query I wrote is efficient or not? This sample table has little data, but I'm dealing with millions of records.

    Hello

    Thanks for posting the CREATE TABLE and INSERT statements. This really helps.

    Here's a solution. I also added last_name, which seems logical. If you do not need it, you can remove it:

    select dept_id, first_name, last_name
    , max (years_attended) keep (dense_rank last order by join_date) years_attended
    from student
    group by dept_id, first_name, last_name;

       DEPT_ID FIRST_NAME LAST_NAME  YEARS_ATTENDED
    ---------- ---------- ---------- --------------
             1 Ann        Coleman                 4
             2 Rock       Star                    6
             3 Jack       Smith                   4

    Kind regards.

    Alberto
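
The idea behind the answer, one pass over the data that keeps the value from the row with the latest join_date in each group, can be sketched in plain Python. This is only an illustration of the technique, not a replacement for the SQL; the function name latest_per_group is made up here.

```python
from datetime import date

# Sample rows from the question:
# (dept_id, first_name, last_name, join_date, years_attended).
students = [
    (1, "Ann", "Coleman", date(1917, 3, 7), 4),
    (1, "Ann", "Coleman", date(1916, 3, 7), 5),
    (2, "Rock", "Star", date(1920, 1, 1), 5),
    (2, "Rock", "Star", date(1921, 1, 1), 6),
    (3, "Jack", "Smith", date(1900, 7, 1), 3),
    (3, "Jack", "Smith", date(1901, 7, 1), 4),
]


def latest_per_group(rows):
    """Keep years_attended from the row with the latest join_date per group."""
    latest = {}
    for dept_id, first_name, last_name, join_date, years in rows:
        key = (dept_id, first_name, last_name)
        # Replace the stored entry only when this row is more recent.
        if key not in latest or join_date > latest[key][0]:
            latest[key] = (join_date, years)
    return sorted(key + (value[1],) for key, value in latest.items())


result = latest_per_group(students)
for row in result:
    print(row)
```

Each row is read once and nothing is joined back to the table, which is the property that matters when the table holds millions of rows.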

  • Get number of records in the date range - IDE: PLSQL Developer

    I want to count the number of member records that fall within a specified date range, based on their effective and expiration dates and their "elg_code". I posted the SQL for some sample data. What I would like returned is three columns with the counts of members whose eff_dt and exp_dt fall within the date range specified by the SQL and who have an elg_code of ' ' (space).

    So what I would like is all members with elg_code ' ' whose eff_dt and exp_dt range covers APR 2012, MAY 2012 and JUN 2012. Take Mark in the sample data I posted: on his record with elg_code ' ', his eff_dt is 01/01/2011 and his exp_dt is 30/04/2012. Mark's record falls within APR 2012, but not MAY or JUN 2012. Marty would count for APR and MAY because his eff_dt is before MAY 2012 and his exp_dt is in MAY 2012, and so on.

    According to the data below, the results should look like this:

    APR MAY JUN
    4   3   2

    APR should have FRANK, MARK, MARTY, MARY.
    MAY should have FRANK, MARTY, MARY.
    JUN should have FRANK and MARY.

    NOAM and JOHN should not appear, as their records with elg_code ' ' have no eff_dt and exp_dt that fall within April-June 2012.

    What I tried, without success, as it appears that I have some kind of Cartesian join issue (?), is:

    select count (m1.mbr_name) APR,
    count (m2.mbr_name) MAY,
    count (m3.mbr_name) JUN
    from mbr2 m1,
    mbr2 m2,
    mbr2 m3
    where m1.eff_dt < '01-may-2012'
    and m1.exp_dt > '01-apr-2012'
    and m1.elg_code = ' '
    and m2.eff_dt < '01-jun-2012'
    and m2.exp_dt > '01-may-2012'
    and m2.elg_code = ' '
    and m3.eff_dt < '01-jul-2012'
    and m3.exp_dt > '01-jun-2012'
    and m3.elg_code = ' '


    Here's the DML

    Thanks for any help!


    create table mbr2 (mbr_name varchar(10), grpid varchar(1), eff_dt date, exp_dt date, elg_code varchar(1));
    commit;

    insert into mbr2 values ('MARK', 'A', to_date('01-01-2011', 'DD-MM-YYYY'), to_date('30-04-2012', 'DD-MM-YYYY'), ' ');
    insert into mbr2 values ('MARK', 'A', to_date('01-05-2012', 'DD-MM-YYYY'), to_date('31-12-2013', 'DD-MM-YYYY'), 'C');

    insert into mbr2 values ('MARTY', 'A', to_date('01-01-2011', 'DD-MM-YYYY'), to_date('31-05-2012', 'DD-MM-YYYY'), ' ');
    insert into mbr2 values ('MARTY', 'A', to_date('01-06-2012', 'DD-MM-YYYY'), to_date('31-12-2013', 'DD-MM-YYYY'), 'C');

    insert into mbr2 values ('FRANK', 'B', to_date('01-01-2011', 'DD-MM-YYYY'), to_date('30-06-2012', 'DD-MM-YYYY'), ' ');
    insert into mbr2 values ('FRANK', 'B', to_date('01-07-2012', 'DD-MM-YYYY'), to_date('31-12-2013', 'DD-MM-YYYY'), 'C');

    insert into mbr2 values ('MARY', 'B', to_date('01-01-2011', 'DD-MM-YYYY'), to_date('30-06-2012', 'DD-MM-YYYY'), ' ');
    insert into mbr2 values ('MARY', 'B', to_date('01-07-2012', 'DD-MM-YYYY'), to_date('31-12-2013', 'DD-MM-YYYY'), 'C');

    insert into mbr2 values ('JOHN', 'C', to_date('01-01-2011', 'DD-MM-YYYY'), to_date('01-07-2011', 'DD-MM-YYYY'), ' ');
    insert into mbr2 values ('JOHN', 'C', to_date('01-07-2011', 'DD-MM-YYYY'), to_date('01-01-2012', 'DD-MM-YYYY'), 'C');

    insert into mbr2 values ('NOAM', 'D', to_date('01-07-2012', 'DD-MM-YYYY'), to_date('31-12-2013', 'DD-MM-YYYY'), ' ');

    commit;

    This gives you a report for the current month and the two preceding months. The column headers must be adjusted ;-)

    select
      count(
      case
      when
        eff_dt < add_months(trunc(sysdate,'MM'), -1)
        and
        exp_dt >= add_months(trunc(sysdate,'MM'), -2)
      then 1
      end) April
    , count(
      case
      when
        eff_dt < add_months(trunc(sysdate,'MM'), 0)
        and
        exp_dt >= add_months(trunc(sysdate,'MM'), -1)
      then 1
      end) May
    , count(
      case
      when
        eff_dt < add_months(trunc(sysdate,'MM'), 1)
        and
        exp_dt >= add_months(trunc(sysdate,'MM'), 0)
      then 1
      end) June
    from mbr2
    where
    elg_code = ' '
    and
    eff_dt < add_months(trunc(sysdate,'MM'), 1)
    and
    exp_dt >= add_months(trunc(sysdate,'MM'), -2)
    
    APRIL     MAY     JUNE
    4     3     2
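
The query above is relative to sysdate. For a fixed set of months, the same conditional-count idea can be sketched in plain Python: one pass over the rows, with one counter per month that is incremented when the record's range touches that month. The function names and the hard-coded month list are assumptions made for this illustration.

```python
from datetime import date

# Sample rows from the question: (mbr_name, elg_code, eff_dt, exp_dt).
MEMBERS = [
    ("MARK", " ", date(2011, 1, 1), date(2012, 4, 30)),
    ("MARK", "C", date(2012, 5, 1), date(2013, 12, 31)),
    ("MARTY", " ", date(2011, 1, 1), date(2012, 5, 31)),
    ("MARTY", "C", date(2012, 6, 1), date(2013, 12, 31)),
    ("FRANK", " ", date(2011, 1, 1), date(2012, 6, 30)),
    ("FRANK", "C", date(2012, 7, 1), date(2013, 12, 31)),
    ("MARY", " ", date(2011, 1, 1), date(2012, 6, 30)),
    ("MARY", "C", date(2012, 7, 1), date(2013, 12, 31)),
    ("JOHN", " ", date(2011, 1, 1), date(2011, 7, 1)),
    ("JOHN", "C", date(2011, 7, 1), date(2012, 1, 1)),
    ("NOAM", " ", date(2012, 7, 1), date(2013, 12, 31)),
]


def month_bounds(year, month):
    """Return the first day of the month and the first day of the next one."""
    first = date(year, month, 1)
    if month == 12:
        return first, date(year + 1, 1, 1)
    return first, date(year, month + 1, 1)


def month_counts(members, months, elg_code=" "):
    """Count records with the given elg_code that are active in each month."""
    counts = {}
    for label, (year, month) in months.items():
        first, next_first = month_bounds(year, month)
        counts[label] = sum(
            1
            for _name, code, eff_dt, exp_dt in members
            if code == elg_code and eff_dt < next_first and exp_dt >= first
        )
    return counts


counts = month_counts(MEMBERS, {"APR": (2012, 4), "MAY": (2012, 5), "JUN": (2012, 6)})
print(counts)
```

Because every row is tested against each month independently, there is no self-join and therefore no Cartesian product to inflate the counts.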
    
  • Find a VM (or datastore, folder, pool, etc.) name by ID

    Hello:

    I wonder if there is a way to find the name of a virtual machine (or ESX host, datastore, folder, pool, etc.) by its ID in vSphere PowerCLI.

    The goal is to find the virtual machine name when the ID is known (and the same for ESX host, datastore, folder, pool, etc.).

    Thank you

    Olegarr

    Hello

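    As I understand it, we can get the managed object ID of a virtual machine:

    # <server> and <vm name> are placeholders; substitute your own values
    $serv = Connect-VIServer -Server <server>
    $vm = Get-VM -Name <vm name>
    Write-Output $vm.Id

    The same goes for the others: for example, first call Get-VMHost, store the result in a variable, and then the variable's Id property gives you the ID.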

    I hope this helps.

  • While loop does not record all the data

    Hello

    I have a LabVIEW program containing two while loops. The first is used to increment a frequency on our machine, and the second, which is nested inside the first, is used to take a certain number of data points at that frequency. After these data points are taken, the frequency is supposed to increase again. This happens a set number of times.

    The problem is that the data file that the data points from the second while loop are supposed to be written to does not contain all the data points the program should have taken. For example, if we want the first loop to increment 10 times and the second to take 5 data points each time, we expect to see 50 data points in our file. But we see only 10. The file always shows a number of data points equal to the number of iterations of the first loop, as if our data were being cut off at that point.

    Does anyone have an idea what our problem could be?


  • How to find duplicate records in a table, then delete them.

    Hi all

    I'm working on an 11gR2 database under Linux. Recently, we created a unique index on two columns in place of a single-column index. When we try to create this index in pre-production and production, we get an error message saying that duplicate values were found. Now my team has asked me to write a PL/SQL package or procedure to find these duplicate values and remove them, or to find any other way to do it. But I'm not familiar with PL/SQL or with how to perform this task at the data level.
    Please help me on this issue, how can I proceed.
    Thanks in advance for your help.

    Try this:

    
    CREATE TABLE z_test2
    AS
       SELECT 1 a, 'aaa' b FROM DUAL
       UNION ALL
       SELECT 1 a, 'aaa' b FROM DUAL
       UNION ALL
       SELECT 1 a, 'bbbb' b FROM DUAL
       UNION ALL
       SELECT 12 a, 'aaa' b FROM DUAL
       UNION ALL
       SELECT 12 a, 'aaa' b FROM DUAL
       UNION ALL
       SELECT 12 a, 'aaa' b FROM DUAL
       UNION ALL
       SELECT 13 a, 'aaa' b FROM DUAL;
    
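    -- Match on ROWID so that only the extra copies are deleted; matching on
    -- (a, b) alone would remove every row of a duplicated group.
    DELETE FROM z_test2
          WHERE ROWID IN
                   (SELECT rid
                      FROM (SELECT ROWID rid,
                                   ROW_NUMBER ()
                                      OVER (PARTITION BY a, b ORDER BY ROWID)
                                      rn
                              FROM z_test2)
                     WHERE rn > 1);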
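
A minimal sketch of duplicate removal that keeps exactly one row per (a, b) pair, so that a unique index can then be created. It is an illustration under assumptions rather than the Oracle statement itself: it uses Python's sqlite3 module, SQLite's implicit rowid in place of Oracle's ROWID, and a MIN(rowid) subquery instead of an analytic function.

```python
import sqlite3

# The same sample rows as in the CREATE TABLE statement above.
ROWS = [
    (1, "aaa"),
    (1, "aaa"),
    (1, "bbbb"),
    (12, "aaa"),
    (12, "aaa"),
    (12, "aaa"),
    (13, "aaa"),
]


def delete_duplicates(rows):
    """Delete all but the first-inserted row of each (a, b) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE z_test2 (a INTEGER, b TEXT)")
    conn.executemany("INSERT INTO z_test2 VALUES (?, ?)", rows)
    # Keep the row with the smallest rowid in each group, delete the rest.
    conn.execute(
        """
        DELETE FROM z_test2
        WHERE rowid NOT IN (SELECT MIN(rowid) FROM z_test2 GROUP BY a, b)
        """
    )
    remaining = conn.execute(
        "SELECT a, b FROM z_test2 ORDER BY a, b"
    ).fetchall()
    conn.close()
    return remaining


remaining = delete_duplicates(ROWS)
print(remaining)
```

Running a SELECT with the same subquery first is a cheap way to review what would be deleted before anything is removed.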
    
                     
    
  • Alternative to find duplicate records

    Hello

    My requirement is to find the duplicate records in the sample below. But it takes a long time to generate the output when run against about 10 lakh records.

    Is there an alternative approach that does not use a JOIN? Thanks in advance.

    with aaa as
    (select 101 as id, 1 as seq, 'asthma' as event, 'medical' as reporter from dual union
    select 101, 3, 'asthma', 'medi' from dual union
    select 101, 2, 'lag', 'meddi' from dual union
    select 102, 2, 'whooping cough', 'LP' from dual union
    select 102, 1, 'whooping cough', 'LPS' from dual union
    select 102, 4, 'whooping cough', 'LPWS' from dual union
    select 102, 3, 'ddd', 'dd' from dual union
    select 103, 1, 'asthma', 'm' from dual union
    select 103, 2, 'asta', 'm' from dual union
    select 104, 2, 'whooping cough', 'xx' from dual
    )
    select x.* from aaa x,
    (select id, event, count (*)
    from aaa
    group by id, event
    having count (*) > 1
    ) b
    where x.id = b.id
    and x.event = b.event

    Something along these lines:

    with aaa as
    
    (select 101 as id, 1 as seq, 'asthma' as event, 'medical' as reporter from dual union
    
    select 101, 3, 'asthma', 'medi' from dual union
    
    select 101, 2, 'lag', 'meddi' from dual union
    
    select 102,2, 'whooping', 'LP' from dual union
    
    select 102,1, 'whooping', 'LPS' from dual union
    
    select 102,4, 'whooping', 'LPWS' from dual union
    
    select 102,3, 'ddd', 'dd' from dual union
    
    select 103, 1, 'asthma', 'm' from dual union
    
    select 103, 2, 'asta', 'm' from dual union
    
    select 104,2, 'whooping', 'xx' from dual
    
    )
    select * from (
    select aaa.* , count(*) over (partition by id,event) rn from aaa
    ) where rn > 1;
    

    Hope this helps

    Alvinder
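
The analytic count attaches a per-group count to every row, and the outer filter keeps rows whose group has more than one member. The same two-step idea can be sketched in plain Python with collections.Counter; the function name duplicated_rows is made up for this illustration.

```python
from collections import Counter

# Sample rows from the answer above: (id, seq, event, reporter).
ROWS = [
    (101, 1, "asthma", "medical"),
    (101, 3, "asthma", "medi"),
    (101, 2, "lag", "meddi"),
    (102, 2, "whooping", "LP"),
    (102, 1, "whooping", "LPS"),
    (102, 4, "whooping", "LPWS"),
    (102, 3, "ddd", "dd"),
    (103, 1, "asthma", "m"),
    (103, 2, "asta", "m"),
    (104, 2, "whooping", "xx"),
]


def duplicated_rows(rows):
    """Return the rows whose (id, event) pair occurs more than once."""
    # First pass: count how often each (id, event) pair occurs.
    counts = Counter((row_id, event) for row_id, _seq, event, _rep in rows)
    # Second pass: keep the rows that belong to a repeated pair.
    return [row for row in rows if counts[(row[0], row[2])] > 1]


dups = duplicated_rows(ROWS)
for row in dups:
    print(row)
```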

  • Separating the records in the database - year wise

    Hi Experts,

    I have a database and I want to separate the records year-wise into separate databases.

    One database for 2013

    One database for 2014, and so on...

    There is currently no procedure for separating the records.

    The records are increasing day by day.

    I intend to do this by querying each table and storing the results in a different database.

    What is the way to do this?

    Cecile

    It takes less storage to add storage to a database than it does to create a separate database for each year of data you want to store, so your argument about saving cost and space is invalid.  It is not feasible; a partitioned table would be a good approach, as has already been suggested.

    David Fitzjarrell

  • How to stop Excel from opening while saving data?

    Hello

    I am trying to save data in an Excel file, and I constantly use the same file to update the information. The problem is that every time I update or create an Excel file, it actually opens an Excel window. I want to create and save data without opening it for the user!

    I hope I have made my problem clear...

    Below you can take a look at some of the VIs I use to create and update information.

    My comments are in Portuguese, so I attached the whole VI, so you can use the help option to understand what is happening.

    Thank you for the attention!

    OK, so... I found the answer to my problem... I worked on it for about 2 days and now I think I understand it.

    If someone has a similar problem, here is the solution:

    In the New Report VI, set the window state to "no change".

    Simple like that xD

  • Advised to do a system restore... cannot find where to enter the restore date

    Where can I find the place to enter the restore date for a system restore?

    Hello

    How to do a Vista system restore
    http://www.Vistax64.com/tutorials/76905-System-Restore-how.html
    I hope this helps.

    Rob Brown - MS MVP - Windows Desktop Experience: Bike - Mark Twain said it right.

  • Registering VMs from the datastore via vSphere Client

    I'm trying to migrate some VMware Server 2 virtual machines to ESXi, but can't see how to import virtual machines from the ESXi datastore into the inventory using the vSphere Client. Am I missing something obvious here, or can this be done through the command line?

    Hi Mark,

    A simple right-click on the vmx file should do it. Take a look at the following KB:

    It will be useful.

    Regards

    Franck

  • How to find duplicates in the table

    I have a table with 3 columns

    table name - employee

    empcode firstname lastname
    XYZ 123 pk
    yzz 456 pk
    101 kkk jk


    ALTER TABLE employee
    ADD (CONSTRAINT employee_PK PRIMARY KEY
    (empcode, firstname, lastname));


    All three columns together form the primary key. We are migrating the data, and there are problems because the combination of all three columns results in duplicates. The last column is expected to contain duplicates, but the first two columns should not, and a complete row of the table (the combination) should have no duplicates.

    I need a query to find duplicates in order to validate whole rows.

    (B-)

    select empcode,firstname,lastname,count(*)
    from employee
    group by empcode,firstname,lastname
    having count(*)>1;
    
