Sunday, April 14, 2013

SSIS Merge Join Trouble: 3 tables into 1

Posted: 14 Apr 2013 09:25 PM PDT

So I am having trouble merging three tables into one. All sources are database tables, and I am using the MERGE JOIN transformation. When I join my first two tables using a standard inner join, it returns the exact same number of rows as the left (and largest) table, which is what I expect.

Here's where my trouble begins. I then apply another sort on the newly created table, because I have to use a different sort key, as none of the three tables have common columns. When I attempt the second merge, it will count the first ~50k rows from the sort, stop counting, and continue to insert more rows into the destination table. I end up with a great many more rows than in either of the original tables. The largest source table has 3.3 million rows, but letting the package run for a few hours generated over 800 million rows! Here is what the data flow looks like:

http://i.stack.imgur.com/gilTh.png

I'm sure this is either a lack of personal understanding of what I'm doing, or I need to take a different approach in SSIS to eliminate this Cartesian product situation.

Any help would be appreciated!
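For anyone puzzling over the same symptom: the row explosion is the classic signature of joining on a key that is not unique on either side, since a join emits one output row per matching pair. A tiny runnable sketch (SQLite standing in for the SSIS sources; all table names are invented):

```python
import sqlite3

# When the join key is not unique on either side, a join emits one row
# per matching pair: 3 duplicate keys on the left x 4 on the right = 12.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t  (k INTEGER, payload TEXT);
    CREATE TABLE right_t (k INTEGER, payload TEXT);
    INSERT INTO left_t  VALUES (1,'a'),(1,'b'),(1,'c');
    INSERT INTO right_t VALUES (1,'w'),(1,'x'),(1,'y'),(1,'z');
""")
rows = conn.execute(
    "SELECT COUNT(*) FROM left_t JOIN right_t ON left_t.k = right_t.k"
).fetchone()[0]
print(rows)  # 12, not 3 or 4
```

With 3.3 million rows on one side, even modest key duplication on both inputs multiplies into hundreds of millions of output rows, which matches the behaviour described above.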

Which MySQL directories to back up using FTP

Posted: 14 Apr 2013 08:38 PM PDT

My server has been hacked and I only have FTP access to recover my data before reinstalling the OS.

Which directories need to be backed up, and where are they? Will MySQL be restored to its previous state if I copy these directories back to the freshly installed server?

View serializability and General serializability

Posted: 14 Apr 2013 07:50 PM PDT

Would anyone know if every serializable schedule is necessarily view-serializable? If not, what would be an example of a serializable schedule that is not view-serializable?

Thanks in advance.

Why does pgAdmin3 set OIDS=FALSE and owner to postgres by default?

Posted: 14 Apr 2013 06:41 PM PDT

Why does pgAdmin3 automatically set the table owner to postgres when it is created, and set OIDS=FALSE? I am new to PostgreSQL and want to know the rationale behind this default behavior.

-- Table: staging.mytable

-- DROP TABLE staging.mytable;

...

WITH (
  OIDS=FALSE
);

ALTER TABLE staging.mytable
  OWNER TO postgres;

SQL Server trigger

Posted: 14 Apr 2013 06:04 PM PDT

I want this trigger to fire after an insert is made with the text 'married' in the marital_status column. This is what I have so far:

ALTER TRIGGER [dbo].[marriage]
ON [dbo].[applicant_personal_info]
AFTER INSERT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    IF (SELECT [marital_status] FROM inserted) = 'married'
    BEGIN
        INSERT INTO [dbo].[applicant_marriage_info] ([dom])
        VALUES ('abc')
    END
END
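One thing worth noting about the attempt above: in SQL Server the inserted pseudo-table can contain many rows, so the scalar IF (SELECT ...) = 'married' test breaks on multi-row inserts; a set-based INSERT ... SELECT FROM inserted (or a per-row condition) handles every row. Sketched here in SQLite syntax purely for illustration, reusing the question's table names:

```python
import sqlite3

# Illustrative only (SQLite syntax, not T-SQL): a per-row trigger with a
# WHEN filter fires once per matching inserted row, even for multi-row inserts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applicant_personal_info (id INTEGER, marital_status TEXT);
    CREATE TABLE applicant_marriage_info (dom TEXT);
    CREATE TRIGGER marriage AFTER INSERT ON applicant_personal_info
    WHEN NEW.marital_status = 'married'
    BEGIN
        INSERT INTO applicant_marriage_info (dom) VALUES ('abc');
    END;
""")
# three inserts: two 'married', one 'single'
conn.executemany(
    "INSERT INTO applicant_personal_info VALUES (?, ?)",
    [(1, 'married'), (2, 'single'), (3, 'married')],
)
n = conn.execute("SELECT COUNT(*) FROM applicant_marriage_info").fetchone()[0]
print(n)  # 2 -- one marriage row per matching inserted row
```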

SQL server disk write queue dramatically grows sometimes

Posted: 14 Apr 2013 09:05 PM PDT

I have SQL Server 2008 and a single database with the full recovery model. Usually the disk write queue length is less than 1, but sometimes it grows to several thousand (3000!) for a few seconds. At those times many write queries end up with timeout errors.

Using Resource Monitor, I found that at that moment sqlserver.exe writes a large amount of data to the main database file (MDF), though usually it writes to the transaction log (LDF).

Using SQL Server Profiler, I found that heavy queries are not running at that moment.

I think it is some kind of SQL Server background operation, but I wonder what kind?

The database also has READ_COMMITTED_SNAPSHOT ON and mirroring (synchronous mode) enabled. Could this be the cause of my issue?

UPDATE: I found that writing to the log (not to the data file) is the default behavior of the full recovery model, and that the log can only be copied to the data file by a Backup Transaction Log operation. I still don't understand why SQL Server copies the log every ten minutes...

What are the disadvantages if indexes are added on all columns?

Posted: 14 Apr 2013 06:31 PM PDT

I know that it is not good design to create too many indexes, but just to understand the theory:
What exactly are the disadvantages of adding an index on every column of a table in MySQL?

  • Do queries get slower if I have indexes on columns that I rarely use?
  • Are the inserts somewhat slower?
  • Is MySQL able to pick out the best index that speeds up the query?
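As a rough illustration of both effects (SQLite here for a runnable sketch; MySQL behaves analogously): each extra index is an extra structure that every INSERT and UPDATE must maintain, while the optimizer still uses only one suitable index per table access, so indexes on rarely-queried columns mostly add write cost and storage:

```python
import sqlite3

# Index every column of a toy table, then check which index a query uses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INT, b INT, c INT)")
for col in ("a", "b", "c"):
    # each of these must be updated on every subsequent INSERT/UPDATE
    conn.execute(f"CREATE INDEX idx_{col} ON t({col})")

conn.executemany("INSERT INTO t VALUES (?,?,?)",
                 [(i, i * 2, i * 3) for i in range(1000)])

plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM t WHERE b = 10").fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)  # only idx_b is chosen; idx_a and idx_c were pure overhead here
```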

MySQL: replicating to a different table engine type

Posted: 14 Apr 2013 08:53 PM PDT

According to the replication documentation from MySQL, it is possible to set up replication from InnoDB source tables to MyISAM destination tables. Unfortunately, the documentation has little to say about drawbacks, data-consistency implications (apart from the CASCADE corner case) and recommended settings for enabling such a replication configuration. A number of questions come to mind regarding this setup:

  1. is statement-based replication preferred over the row-based variant or vice-versa?
  2. does either the InnoDB or the MyISAM side need some specific settings so data consistency would not break all by itself as the result of "normal" replication activity?
  3. are DDL executions on the master handled on the slave in a sane manner?
  4. how would I prevent an ALTER TABLE blah ENGINE=InnoDB statement run at the master from propagating to the slave?
  5. Is there a recommended way of setting up such a "split" setup? I could think of a direct ALTER TABLE blah ENGINE=MyISAM on the slave, is this viable?
  6. any other caveats one should know about?

How to learn oracle business intelligence enterprise (OBIE) from scratch? [closed]

Posted: 14 Apr 2013 01:27 PM PDT

Hi, I am an electronics and communication graduate, but I want to learn Oracle Business Intelligence Enterprise (OBIE). I have some theoretical knowledge of RDBMS and SQL, which I learnt during my graduation. So I want to know: how should I start learning OBIE from scratch?

Speeding up a query on MySQL

Posted: 14 Apr 2013 12:22 PM PDT

I have a table with more than 10 million rows and 10 fields (columns). There is an index on field_1 and I am running the following query.

create table t2
select field_1,
       sum(ifnull(field_2,0)) as field_2,
       sum(ifnull(field_3,0)) as field_3,
       sum(ifnull(field_4,0)) as field_4
from t1
group by field_1;

The data type of all the columns is varchar(200).

This query has not produced a result even after running for more than a day. Any suggestions on getting the results quickly would be helpful.

"To find a version for your PC check with the software publisher" when installing MongoDB

Posted: 14 Apr 2013 11:07 AM PDT

I want to install MongoDB on Windows 8, but when I try to run C:\mongodb\bin\mongod.exe I get this error:

To find a version for your pc check with the software publisher.  

database design- diagrams

Posted: 14 Apr 2013 06:52 AM PDT

I'm trying to derive an entity-relationship diagram from a class diagram. In the class diagram, I have a class Player with a one-to-many relationship to another class Payment. Payment has no 'paymentID' attribute, and 'playerID' is a foreign key. Since the same player ('playerID') can make the same payment many times, I thought of adding an attribute 'paymentID' to the Payment table. Is this right? Will I still follow the requirements? I'm new to all this; thank you for any help.

Do I need to reenter old data after adding an index to a table?

Posted: 14 Apr 2013 05:09 AM PDT

I want to add indexes to my tables. Some of my tables already have a couple thousand rows.

Do I need to re-enter my stored data after adding an index to its columns (so that the indexing affects the existing rows as well), or will the index take care of the old data by itself?
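For what it's worth, a quick runnable check (SQLite for illustration; MySQL's CREATE INDEX works the same way) shows that building an index scans the rows already in the table, so existing data is covered without re-entering anything:

```python
import sqlite3

# Insert data first, add the index afterwards, then verify the index
# is used to find the pre-existing rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val INT)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])

conn.execute("CREATE INDEX idx_val ON t(val)")  # added after the data

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE val = 42").fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)  # the old rows are found via idx_val
```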

Database design: Dividing multiple identical tables, good or bad?

Posted: 14 Apr 2013 04:45 AM PDT

I am very new at SQL and databases in general. I only use them for the occasional homework so I haven't even tried to master them.

I have seats at a theater, the seats are divided into 4 main areas (A, B, C, D). Each area has the same number of rows and the same number of seats per row.

In my database, I'd like to have Row + SeatNumber as a compound primary key, and to have one table for each area.

Now, I don't yet know how I'll do my selects, but what I want to ask: if I do it this way, will my selects be doable? I want to, for example, select an exact position within the theater (where I know the area, row and seat number).

Would the 4 tables be a hindrance? Could you give an example of how such a select might look?

P.S. This is my first time at the site, if the question does not belong here, please direct me to a more suitable site within stack exchange.
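One common alternative worth sketching (column names here are invented): a single table with an area column and a compound primary key (area, row, seat) makes the exact-position select trivial and avoids maintaining four identical tables:

```python
import sqlite3

# One table for all four areas; the compound primary key still
# identifies every seat uniquely.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE seat (
        area  TEXT    CHECK (area IN ('A','B','C','D')),
        row   INTEGER,
        seat  INTEGER,
        taken INTEGER DEFAULT 0,
        PRIMARY KEY (area, row, seat)
    )
""")
conn.execute("INSERT INTO seat (area, row, seat) VALUES ('B', 3, 12)")

# select an exact position: area B, row 3, seat 12
hit = conn.execute(
    "SELECT area, row, seat FROM seat WHERE area=? AND row=? AND seat=?",
    ("B", 3, 12),
).fetchone()
print(hit)  # ('B', 3, 12)
```

With four separate tables, the same lookup would need the table name chosen in application code (or a UNION over all four), which is what makes the split design awkward.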

Can a production Oracle DB server also have a dataguard active standby for a different production database?

Posted: 14 Apr 2013 06:18 AM PDT

If you have a production Oracle database (PRIM1) protected using Oracle Data Guard with a standby (STBY1) physical server in a different data centre, the standby server needs to be of a similar specification to the primary server, so that in the event of a disaster it can become the primary.

This setup means that you have redundant hardware 'just in case'.

If you have another production database (PRIM2), could you install it on the same physical server as STBY1 of the first application, and install the standby for this application (STBY2) on the primary server of the first application (PRIM1)?

This setup would mean maximising use of the hardware. In the event of a failover, both primary databases might be running on the same machine, so you might need additional memory and/or CPU available to provision in a disaster, but you would be using the hardware most of the time.

Any thoughts on the issues a setup like this would create?


Oracle error handling strategy

Posted: 14 Apr 2013 03:02 AM PDT

I am developing a project using Oracle, writing functions and stored procedures to handle CRUD statements.

My question: in addition to Oracle check constraints, do I have to check error situations myself, or should I let Oracle do the job?

For example, if I have a unique column in a table and I want to insert into it, do I have to check that the value is unique myself, or should I let the Oracle unique constraint do the job and raise an error if the value is repeated?

Thanks in advance.

read multiple values from same column

Posted: 14 Apr 2013 06:17 AM PDT

How can I read multiple values stored in the same column and compare them with ids in the same table and in another table to fetch their records? I have a table with columns pid, authors, citing_papers, e.g. a row like 10.1.1, abc, "10.1.2,10.3.2". I want to read the citing-paper ids separately and compare each one with pid: if it exists in the same table, that's fine; otherwise, fetch its record from another table through the same pid comparison. Thanks.
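Assuming the citing_papers column really holds comma-separated ids (a guess from the sample row), one naive sketch of the lookup in Python/SQLite, with invented table names:

```python
import sqlite3

# Split the comma-separated citing_papers value, then check each id
# against pid in the same table, falling back to a second table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE papers (pid TEXT PRIMARY KEY, authors TEXT, citing_papers TEXT);
    CREATE TABLE other  (pid TEXT PRIMARY KEY, authors TEXT);
    INSERT INTO papers VALUES ('10.1.1', 'abc', '10.1.2,10.3.2');
    INSERT INTO papers VALUES ('10.1.2', 'def', '');
    INSERT INTO other  VALUES ('10.3.2', 'ghi');
""")
row = conn.execute(
    "SELECT citing_papers FROM papers WHERE pid = '10.1.1'").fetchone()
results = {}
for cited in row[0].split(","):
    hit = conn.execute("SELECT pid FROM papers WHERE pid = ?", (cited,)).fetchone()
    if hit is None:  # not in the same table -> try the other table
        hit = conn.execute("SELECT pid FROM other WHERE pid = ?", (cited,)).fetchone()
    results[cited] = hit is not None
print(results)  # {'10.1.2': True, '10.3.2': True}
```

Longer term, storing one citing-paper id per row in a separate table (a normalized design) would let plain joins replace the string splitting.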

How can I improve this nested MySQL query to avoid redundant WHERE statement?

Posted: 14 Apr 2013 04:34 AM PDT

I am trying to improve on a nested MySQL query, using the following table structure (omitting a few dozen not relevant to my question):

tbl_users:
ID | email (all subscribers to our email list)

tbl_mailings:
ID | mail_title (each marketing email)

tbl_sends:
ID | user_id | mailing_id (each user who was sent each mailing)

tbl_opens:
ID | user_id | mailing_id (each user who opened each mailing)

tbl_actions:
ID | user_id | mailing_id (each user who took the target action on each mailing)

I inherited this query that I have been using to get the open rate and action rate for each mailing:

SELECT subq_opens.numopens / subq_sends.numsends AS Open_Rate,
       subq_actions.numactions / subq_sends.numsends AS Action_Rate
FROM
  (SELECT COUNT(DISTINCT user_id) AS numsends
   FROM tbl_sends WHERE mailing_id = 5694) AS subq_sends,
  (SELECT COUNT(DISTINCT user_id) AS numopens
   FROM tbl_opens WHERE mailing_id = 5694) AS subq_opens,
  (SELECT COUNT(DISTINCT user_id) AS numactions
   FROM tbl_actions WHERE mailing_id = 5694) AS subq_actions

Not too elegant, but functional, even though I have to change the mailing ID in multiple places every time.
However, now I need to change that WHERE statement to something a great deal more complicated, and I don't want to have to repeat the WHERE statement so many times.

Is there a way I can reformulate this without the redundant WHERE?
I have tried various nesting schemes, but can't seem to find a way to get it right. I can't change anything about the table structure; that is set in stone. Thank you in advance for your help.
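One possible reformulation, sketched in SQLite so it can run standalone: LEFT JOIN the opens and actions tables onto sends by user and mailing, so the mailing filter appears exactly once, with COUNT(DISTINCT ...) doing the per-table tallies:

```python
import sqlite3

# Toy data: mailing 5694 sent to 4 users, opened by 2, actioned by 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_sends   (ID INTEGER, user_id INT, mailing_id INT);
    CREATE TABLE tbl_opens   (ID INTEGER, user_id INT, mailing_id INT);
    CREATE TABLE tbl_actions (ID INTEGER, user_id INT, mailing_id INT);
    INSERT INTO tbl_sends   VALUES (1,1,5694),(2,2,5694),(3,3,5694),(4,4,5694);
    INSERT INTO tbl_opens   VALUES (1,1,5694),(2,2,5694);
    INSERT INTO tbl_actions VALUES (1,1,5694);
""")
# the WHERE condition appears only once; COUNT(DISTINCT ...) ignores the
# NULLs produced by non-matching LEFT JOIN rows
open_rate, action_rate = conn.execute("""
    SELECT CAST(COUNT(DISTINCT o.user_id) AS REAL) / COUNT(DISTINCT s.user_id),
           CAST(COUNT(DISTINCT a.user_id) AS REAL) / COUNT(DISTINCT s.user_id)
    FROM tbl_sends s
    LEFT JOIN tbl_opens   o ON o.mailing_id = s.mailing_id AND o.user_id = s.user_id
    LEFT JOIN tbl_actions a ON a.mailing_id = s.mailing_id AND a.user_id = s.user_id
    WHERE s.mailing_id = 5694
""").fetchone()
print(open_rate, action_rate)  # 0.5 0.25
```

The same shape works in MySQL, and a more complicated filter then only needs to be written once, against the sends table.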

In Postgres, How can I merge two tables?

Posted: 14 Apr 2013 11:17 AM PDT

I have a set of web services that all need updating with a new set of static data.

I can't use a distributed transaction, but would like to be able to ask each service to roll back its changes if I detect that something went wrong.

I aim to ask each service to backup the data I care about, and apply the update... then later ask the service to restore the backup if I need to recover.

To complicate things, I may only be backing up some of the data in the table (say, rows where the ID starts with "My.", so "My.123" will be backed up and restored but "YOURS.345" won't be changed by this process at all).

I'd like to make backup and restore functions that take a table name.

I was thinking of backing up tables with a generic version of the following code:

--=== BACKUP ===--
create table utilities.labpreferences_snapshot
  ( LIKE utilities.labpreferences EXCLUDING DEFAULTS ) WITHOUT OIDS;

-- copy data
insert into utilities.labpreferences_snapshot
select * from utilities.labpreferences s
where s.id like 'My.%';

When restoring, I expect that some rows may have been modified or deleted and new ones created. I want to make sure I don't break any foreign keys referencing data in this table, but I don't expect any new foreign keys since the backup state was captured.

I could restore the data in the same way I copy it... but INSERT only works for rows that were deleted, so I could delete all the data I want to replace:

--option1  --=== RESTORE ===--
-- possibly stop foreign key constraint checking?

-- remove master data
DELETE FROM utilities.labpreferences s
WHERE s.id like 'My.%';

-- replace snapshot data
insert into utilities.labpreferences
select * from utilities.labpreferences_snapshot;

-- restart foreign key constraint checking somehow?

but I worry about the foreign keys.

I could delete the added ones, add the deleted ones, and update the modified ones.

--option2  --=== RESTORE ===--
DELETE FROM utilities.labpreferences s
WHERE s.id like 'My.%'
  and s.id not in (select id from utilities.labpreferences_snapshot);

insert into utilities.labpreferences
select * from utilities.labpreferences_snapshot s
where s.id not in (select id from utilities.labpreferences);

Update ???
-- maybe something like http://stackoverflow.com/questions/3891758/how-to-update-one-table-from-another-one-without-specifying-column-names

The update looks hard.

I'm feeling a little out of my depth. Can anyone suggest a solution?
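For the update step specifically, a portable sketch using a correlated subquery (PostgreSQL also supports the terser UPDATE ... FROM form); table and column names here are simplified stand-ins for the question's schema, and `pref` is an invented payload column:

```python
import sqlite3

# Restore snapshot values into the matching live rows only, leaving
# rows outside the snapshot untouched.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labpreferences          (id TEXT PRIMARY KEY, pref TEXT);
    CREATE TABLE labpreferences_snapshot (id TEXT PRIMARY KEY, pref TEXT);
    INSERT INTO labpreferences VALUES ('My.123', 'modified'), ('YOURS.345', 'theirs');
    INSERT INTO labpreferences_snapshot VALUES ('My.123', 'original');
""")
conn.execute("""
    UPDATE labpreferences
       SET pref = (SELECT s.pref
                     FROM labpreferences_snapshot s
                    WHERE s.id = labpreferences.id)
     WHERE id IN (SELECT id FROM labpreferences_snapshot)
""")
restored = conn.execute(
    "SELECT pref FROM labpreferences WHERE id = 'My.123'").fetchone()[0]
untouched = conn.execute(
    "SELECT pref FROM labpreferences WHERE id = 'YOURS.345'").fetchone()[0]
print(restored, untouched)  # original theirs
```

Because the UPDATE touches only rows that still exist, it avoids the foreign-key worries of the delete-and-reinsert approach for modified rows; deletes and inserts are then needed only for rows that were added or removed.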

MySQL backup InnoDB

Posted: 14 Apr 2013 12:01 PM PDT

I have a VoIP server running 24x7. Even at off-peak hours at least 150+ users are connected. My server runs MySQL with the InnoDB engine on the Windows 2008 platform. I'd like to take a full database backup at least twice a day without shutting down my service.

According to Peter Zaitsev, the founder of Percona, mysqldump --single-transaction is not always good.

read here if you are interested

As I'm not a DBA, I'd like to know which would be the best solution for taking a database backup in my scenario.

Thanks,

SQL Server 2008 R2 replication high delivery latency

Posted: 14 Apr 2013 01:01 PM PDT

I am seeing unusually high delivery latency between our distributor and subscribers, and I do not understand why.

In this configuration we have 3 SQL Servers using transactional push replication to replicate data from one master server to two reporting servers.

We have 9 publications. The distribution agent for most publications shows under 5 ms, but one shows 2000+ ms to both subscribers.

The suspect publication has only 4 small articles (tables) that rarely, if ever, change. I've checked, and each table has a primary key.

I've also checked the @status parameter for each article according to the MS KB: The distribution agent may experience high latency when you configure transactional replication with articles that are configured not to replicate changes as parameterized statements.

I'm tempted to start dropping articles to find out if one particular table is the culprit.

Does anyone have any suggestions as to what I can look at?

How can I verify I'm using SSL to connect to mysql?

Posted: 14 Apr 2013 03:01 PM PDT

I have configured my server to allow SSL, and have modified my client ~/.my.cnf so I use SSL:

[client]
ssl
ssl-cipher=DHE-RSA-AES256-SHA
ssl-ca=~/certs/ca-cert.pem

When I log in with my client and view the status, it lists a cipher on the SSL line:

mysql> \s
--------------
SSL:            Cipher in use is DHE-RSA-AES256-SHA

Without installing something like wireshark to verify that the connection is secure, can I assume that I'm connecting via SSL based on this information?

Strange characters in mysqlbinlog output

Posted: 14 Apr 2013 10:01 AM PDT

Has anyone experienced this? Data replicates fine, but when output by mysqlbinlog there are hidden characters that break the input.

  • mysqlbinlog Ver 3.3 for Linux at x86_64
  • mysql 5.5.28 server

Thanks! Julie

Proper indexes or anything else to optimize

Posted: 14 Apr 2013 10:22 AM PDT

I need help to optimize the query to avoid using "Using temporary", "Using filesort".

CREATE TABLE `target_bahrain_pepsi` (
    `id` int(11) DEFAULT NULL,
    `col_value` varchar(50) DEFAULT NULL,
    `source_file_id` int(11) DEFAULT NULL,
    KEY `id` (`id`)
) ENGINE=InnoDB

SELECT su.adbrand_id, sum(st.col_value) as amount
FROM statex_with_id su
INNER JOIN target_bahrain_pepsi st ON st.id = su.id
GROUP BY su.adbrand_id
ORDER BY amount DESC;

Table statex_with_id also has an index.

mysql> EXPLAIN select su.adbrand_id, sum(st.col_value) as amount
    -> from statex_with_id su
    -> INNER JOIN target_bahrain_pepsi st
    -> ON st.id = su.id GROUP BY su.adbrand_id ORDER BY amount DESC;
+----+-------------+-------+--------+---------------+---------+---------+------------------+---------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref              | rows    | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | st    | index  | id            | id      | 58      | NULL             | 1804021 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | su    | eq_ref | PRIMARY       | PRIMARY | 4       | dashboard1.st.id |       1 | Using where                                  |
+----+-------------+-------+--------+---------------+---------+---------+------------------+---------+----------------------------------------------+

Connecting to a SQL Server database from a Flash program

Posted: 14 Apr 2013 11:01 AM PDT

I currently have the ability to utilize Microsoft SQL Server 2012. I am developing a project with Adobe Flash Builder 4.7.

If I link my database with Adobe Flash Builder, are there any additional steps I must take to make the database live, or, as long as my computer is running, will this database be accessible from any device that uses it?

In other words, is this a LAN-only system, or does it automatically make itself available to the programs I link to it?

Oracle Express edition on Ubuntu - control file missing

Posted: 14 Apr 2013 02:01 PM PDT

I have installed the Oracle Express edition on Ubuntu as mentioned here.

I am facing issues when I try to create a sample table.

Started oracle

$ sudo service oracle-xe start
Starting Oracle Database 11g Express Edition instance.

Started sqlplus

$ sqlplus / as sysdba  

Executed the CREATE command

SQL> CREATE TABLE SAMPLE (ID NUMBER);
CREATE TABLE SAMPLE (ID NUMBER)
*
ERROR at line 1:
ORA-01109: database not open

After a series of research on web, I tried to shutdown and restart oracle:

Shutdown command

SQL> shutdown
ORA-01507: database not mounted
ORACLE instance shut down.

Started the oracle instance

SQL> startup
ORACLE instance started.

Total System Global Area  688959488 bytes
Fixed Size                  2229688 bytes
Variable Size             411044424 bytes
Database Buffers          272629760 bytes
Redo Buffers                3055616 bytes
ORA-00205: error in identifying control file, check alert log for more info

I realized that the control file is missing at /u01/app/oracle/oradata/XE. XE is the DB name.

So I tried to create the control file as follows:

SQL> CREATE CONTROLFILE SET DATABASE XE RESETLOGS;

Control file created.

Tried to create the sample table again

SQL> CREATE TABLE SAMPLE(ID NUMBER);
CREATE TABLE SAMPLE(ID NUMBER)
*
ERROR at line 1:
ORA-01109: database not open

So I tried to issue the following command

SQL> ALTER DATABASE OPEN RESETLOGS;
ALTER DATABASE OPEN RESETLOGS
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/product/11.2.0/xe/dbs/dbs1XE.dbf'

What should be done next? I am clueless as I am not a database guy.

Note:

Output of

$ lsnrctl services

LSNRCTL for Linux: Version 11.2.0.2.0 - Production on 04-JAN-2013 09:15:37

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC_FOR_XE)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0
         LOCAL SERVER
Service "XE" has 1 instance(s).
  Instance "XE", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         LOCAL SERVER
The command completed successfully
