Monday, July 22, 2013

[how to] Cannot create FOREIGN KEY for a table

Cannot create FOREIGN KEY for a table

Posted: 22 Jul 2013 08:56 PM PDT

I'm not a professional DBA. I ran into a problem when attempting to create a FOREIGN KEY for a specific table with the following SQL script (I'm using MySQL):

ALTER TABLE mmy_answer
  ADD CONSTRAINT fk_answer_userId FOREIGN KEY (userId)
  REFERENCES mmy_user_account(id);

I'm sure no FOREIGN KEY named 'fk_answer_userId' already exists on table mmy_answer, and the referenced table mmy_user_account exists. The following is the error:

Error Code: 1452. Cannot add or update a child row: a foreign key constraint fails   (`marry0528`.<result 2 when explaining filename '#sql-734_5'>, CONSTRAINT   `fk_answer_userId` FOREIGN KEY (`userId`) REFERENCES `mmy_user_account` (`id`))  

Can anybody help me? Thanks.
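Error 1452 during an ALTER TABLE usually means the table already contains rows whose userId has no matching id in mmy_user_account, so the new constraint cannot be validated. A quick check along these lines (a sketch using the table and column names above) should reveal any orphaned rows:

-- rows in mmy_answer that reference a non-existent user;
-- these must be fixed or removed before the FK can be added
SELECT a.userId, COUNT(*) AS orphaned_rows
FROM mmy_answer a
LEFT JOIN mmy_user_account u ON u.id = a.userId
WHERE a.userId IS NOT NULL
  AND u.id IS NULL
GROUP BY a.userId;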

"Alter table on a small table(1000 rows)" execute very slow on mysql slave db while within 1 second on master db?

Posted: 22 Jul 2013 07:45 PM PDT

I added a column to a very small table (1,000 rows). It executed within 1 second on the master DB and on one slave DB (a backup slave), while on my main slave DB (which serves almost all read requests) the ALTER statement still had not finished after 600+ seconds. I want to know what could make this happen.

In the end I restarted that slave DB.

Plan for storing transactions in an easily retrievable manner, indefinitely

Posted: 22 Jul 2013 06:55 PM PDT

I've got a large (many columns) MySQL database table (InnoDB) that has a fairly high number of INSERTs (~500/day). Think of the records as financial transactions. Clients need to be able to view these records and/or retrieve a report containing details about the transactions through an online system (going back as far as possible).

My question is this: Is it feasible to allow these records to accumulate and simply SELECT them from the table when preparing a listing, even accounting for significant growth in volume? The traffic would be relatively low.

We've thought of implementing a clever scheme of separating the table into years; however, that would add significantly to the complexity of the project. Assuming the table was carefully optimized and indexed adequately, would the table still be fast enough at 1,000,000+ records per year?
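If splitting by year ever becomes necessary, MySQL's built-in range partitioning can do it without separate tables; a minimal sketch (assuming a transactions table with a txn_date column, which is not the poster's actual schema):

CREATE TABLE transactions (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    txn_date  DATE NOT NULL,
    amount    DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (id, txn_date)          -- the partition column must be part of every unique key
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(txn_date)) (
    PARTITION p2012 VALUES LESS THAN (2013),
    PARTITION p2013 VALUES LESS THAN (2014),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);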

I hope this is not too vague or specific to warrant an answer and would like to thank those who take the time to consider responding.

Mysterious missing transactional replication tokens

Posted: 22 Jul 2013 02:29 PM PDT

I am looking at implementing transactional replication subscriptions in an AlwaysOn Availability Group. I tried to use this as a guide:

https://www.simple-talk.com/sql/backup-and-recovery/expanding-alwayson-availability-groups-with-replication-subscribers/

The reason I say "tried" is not in any way to besmirch the author; in this case he is dealing with push subscriptions and we are doing pull. There was a spectacular flame-out when trying to initialize from LSN ("The transactions required for synchronizing the subscription with the specified log sequence number (LSN) are unavailable at the Distributor. Specify a higher LSN."), perhaps because this is a pull subscription and it moved on, whereas in the example the push agent was disabled and then brought up on the now-primary. Following this I just re-initialized the subscription (it is not that big a publication).

Anyhow, Replication Monitor, which from experience can be an unreliable narrator up there with the greats of literature, now tells me this subscription's performance is 'Excellent' and that it was last synchronized 1 minute ago as of this writing, yet when I try to insert tracer tokens to get a feel for latency, this subscription is nowhere to be seen. Elsewhere in Replication Monitor, and in the SQL Server Agent jobs, there are no big oddities.

What is going on? Is using pull as opposed to push subscriptions not advised in this scenario? I am somewhat befuddled at the moment and any enlightenment would be welcomed and appreciated.
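For reference, tracer tokens can also be posted and read back with T-SQL instead of Replication Monitor; a minimal sketch (the publication name is a placeholder, and both calls are run at the publisher in the published database):

-- post a new tracer token into the publication
EXEC sys.sp_posttracertoken @publication = N'MyPublication';

-- list the tokens recorded for the publication
EXEC sys.sp_helptracertokens @publication = N'MyPublication';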

Recommended data type for fixed-length data in MySQL

Posted: 22 Jul 2013 03:06 PM PDT

What's the best data type for storing fixed-length data such as

  • MD5 password 1f3870be274f6c49b3e31a0c6728957f
  • Hex color f2a709

They don't need character sets; everything is strictly [a-z0-9].

I use the BINARY data type for that. Is this a good choice or not?
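One common pattern (a sketch, assuming MySQL and the example values above) is to store the raw bytes rather than the hex text, halving the storage: BINARY(16) for an MD5 digest and BINARY(3) for an RGB color, converting with UNHEX()/HEX() at the boundaries:

CREATE TABLE fixed_len_demo (
    md5_hash  BINARY(16) NOT NULL,   -- 32 hex characters packed into 16 bytes
    rgb_color BINARY(3)  NOT NULL    -- 6 hex characters packed into 3 bytes
);

INSERT INTO fixed_len_demo (md5_hash, rgb_color)
VALUES (UNHEX('1f3870be274f6c49b3e31a0c6728957f'), UNHEX('f2a709'));

SELECT HEX(md5_hash), HEX(rgb_color) FROM fixed_len_demo;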

Splitting a large SQL Server MDF file

Posted: 22 Jul 2013 03:57 PM PDT

I have a large (1.2 terabyte) SQL Server database that I need to migrate to a new server. Most of the database lives on a single, 1.25 TB data file, and a little bit sits on a much-more-manageable 550 GB file (which is practically empty).

Now, the tricky bit: the server to which I'm migrating only has three 700 GB volumes, meaning I need to somehow dissect this goliath into three roughly equal chunks. Most advice I've found involves creating 3 target files and running DBCC SHRINKFILE with EMPTYFILE on my main file to empty it into the targets, but that would take ages with a database this large.

Is there a recommended method for splitting a database this large? I'm considering using the SQL Server Integration Services data export feature to dump the data into a clone database with the proper file structure, but I'm curious whether there's a better way.
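For reference, the commonly suggested approach looks roughly like this (a sketch; the database name, logical file names, paths and sizes are placeholders):

-- add target files on the new volumes to the same filegroup
ALTER DATABASE BigDb ADD FILE
    (NAME = BigDb_data2, FILENAME = 'E:\Data\BigDb_data2.ndf', SIZE = 500GB),
    (NAME = BigDb_data3, FILENAME = 'F:\Data\BigDb_data3.ndf', SIZE = 500GB);

-- push the pages out of the original file into the other files in the filegroup
USE BigDb;
DBCC SHRINKFILE (BigDb_data1, EMPTYFILE);

As the question notes, EMPTYFILE moves pages one at a time, which is why it is slow on a file this size.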

mysqldump: replace drop table in output with delete?

Posted: 22 Jul 2013 01:37 PM PDT

I am using mysqldump to dump some records in a table:

mysqldump mydatabase mytable --where="field=value"  

This emits a script that includes a DROP TABLE `mytable` statement.

Since I use the --where option, when I replay the script I want to overwrite only those records that satisfy the condition field=value. I want the emitted script to delete just those records:

delete from `mytable` where field=value  

Is there a way to achieve this?
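As far as I know, mysqldump cannot emit a DELETE ... WHERE itself, but a common workaround (a sketch using the database, table and condition from above) is to suppress the DROP TABLE/CREATE TABLE statements and prepend the targeted DELETE yourself:

# dump only the matching rows, without DROP TABLE / CREATE TABLE
mysqldump --no-create-info --skip-add-drop-table \
    mydatabase mytable --where="field=value" > rows.sql

# prepend the delete so replaying the file overwrites only those rows
printf 'DELETE FROM `mytable` WHERE field=value;\n' | cat - rows.sql > restore.sql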

Connecting to postgres from windows to virtual machine

Posted: 22 Jul 2013 01:43 PM PDT

I have the following setup

  • Windows 8
  • Oracle VM virtual box with arch Linux running on it
  • Installed and configured PostgreSQL on Linux
  • Setup the ports in virtual box so that
    • 45432 port on windows goes to 5432 on Linux
    • 40022 port on windows goes to 22 on Linux
  • On Linux PostgreSQL, I have created user erjik and database mydb.
  • Firewall is shut down on windows.

The problem is that I am trying to connect from IntelliJ IDEA using the JDBC PostgreSQL driver, just running a test connection against this address:

jdbc:postgresql://127.0.0.1:45432/mydb
username: erjik
password: xx

However, the test connection fails.

pg_hba.conf is unchanged since installation, with the following settings:

local   all   all                  trust
# IPv4
host    all   all   127.0.0.1/32   trust
# IPv6
host    all   all   ::1/128        trust

In postgresql.conf I have set only this settings:

listen_addresses = '*'  

and

port = 5432  

What could be the reason the connection attempt fails? Maybe I missed some settings.

EDIT 1: on the Linux side, the command:

psql -h localhost -U erjik mydb   

executes successfully.
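One thing worth checking (an assumption on my part, since it depends on how VirtualBox forwards the port): a connection arriving through NAT port forwarding may not reach PostgreSQL with source address 127.0.0.1, so the existing host lines might not match it. A temporary, broad pg_hba.conf entry such as the following would rule that out (reload PostgreSQL after editing, and tighten it again afterwards):

# temporary, for diagnosis only: accept any IPv4 client with password auth
host    all    all    0.0.0.0/0    md5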

Where to install Oracle client software on SharePoint 2010 Server

Posted: 22 Jul 2013 01:40 PM PDT

We have an ASP.NET web application hosted on SharePoint Server 2010. Our application uses an Oracle database to save and retrieve data. For our application to work on the SharePoint server, we need to install the Oracle client software.

Our SharePoint Server architecture consists of the following:

  • 2 web front ends
  • Index Server
  • Crawl Server
  • Application server
  • Load Balancer.

I am not the admin on the SharePoint server and am trying to figure out where I have to install the Oracle client. Do I need to install it on all of our servers or just on the web front ends? Can someone please help me with this?

1062 Duplicate entry but there are no duplicates?

Posted: 22 Jul 2013 12:12 PM PDT

I keep getting this error: failed to INSERT: [1062] Duplicate entry 'Upping Your Type Game-http://jessicahische.is/talkingtype' for key 'PRIMARY'.

The problem is that there are no duplicate entries, which is why I'm confused by that error.

Is there any way to check using InnoDB?

I have tried exporting the table, reading through the SQL dump and searching for duplicates, but there weren't any.
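For reference, a direct way to look for collisions from within MySQL (a sketch; the table and column names are placeholders, chosen because the error message suggests a composite primary key of a title and a URL):

SELECT title, url, COUNT(*) AS occurrences
FROM my_table
GROUP BY title, url
HAVING COUNT(*) > 1;

If this returns nothing while the INSERT still fails, the row being inserted already exists once in the table, which also raises error 1062.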

Bus time schedule database design

Posted: 22 Jul 2013 06:27 PM PDT

Hi, I'm trying to figure out how to design a database for a bus timetable.

There are several lines; each line has two routes (forward and back), each route has several stations (stops), and each station may belong to several routes.

The main goal of the database is to easily select the next arrival time for a specific bus station with a specific line.

I googled and came up with the following database design:

lines (id, name, ...)

routes (id, name, line_id, ...)

stations (id, location)

route_station (id, route_id, station_id)

times (id, station_id, line_id, time_forward, time_back)

So, for instance

times table

id   station_id   line_id   time_forward   time_back
1        10          100        9:15          9:18
2        10          100        9:30          9:33
3        10          100        9:45          9:48
4        10          100       10:00         10:03
5        10          100       10:15         10:18
6        10          100       10:30         10:33
7        10          100       10:45         10:48
8        20          100        9:20          null
9        20          100        9:35          null
10       20          100        9:50          null
11       20          100       10:05          null
...
45       40          200        9:55          null
46       40          200        9:09          null
...

In the first 7 rows, the line with line_id=100 stops at different times at the station with station_id=10 (the null in time_back indicates that no backward route passes through station_id=20).

My questions are:

  1. I'm not sure if this design is right at all. I'm not looking for performance; I just want to know whether I will have problems with this design later on (for example: is it easy to insert a new station into a line? Is it easy to select the next arrival time for a bus station on a specific line?).
  2. How do I deal with different frequencies? The timetable differs between weekdays and weekends. If my design is OK, the example above works for weekday arrival times but not for weekends. Do I need a second table, or should I add a column to the times table indicating the type of day? (See the sketch after this list.)
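One way the second point is often handled (a sketch built on the times table above and assuming MySQL; day_type is an added column, not part of the original design) is to key each timetable row by the kind of service day:

ALTER TABLE times
    ADD COLUMN day_type ENUM('weekday','saturday','sunday') NOT NULL DEFAULT 'weekday';

-- next forward arrival at station 10 on line 100 for the current kind of day,
-- assuming time_forward is stored as a TIME column
SELECT MIN(time_forward)
FROM times
WHERE station_id = 10
  AND line_id    = 100
  AND day_type   = 'weekday'
  AND time_forward > CURTIME();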

Thanks.

P.S. I've taken a look at this design, but I think it's too much for my system.

P.S. 2: I've read that a NoSQL database may fit this kind of system better, but my knowledge is limited to relational databases.

Constraint on Composite Type

Posted: 22 Jul 2013 01:07 PM PDT

How can I create a constraint on a sub-field of composite type?

Pseudocode

create type axis(
    major_axis float,
    minor_axis float,
    angle float constraint angle_constraint check (angle between -90 and 90)
);

create table sample(
    axis1 axis,
    axis2 axis
);

Is this possible in PostgreSQL 9.2? It seems to be not possible in 9.1 as mentioned here.
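One workaround that does work (a sketch, assuming PostgreSQL 9.2 and the type above) is to leave the composite type unconstrained and add table-level CHECK constraints on each column that uses it, referencing the attribute with the (column).field syntax:

CREATE TYPE axis AS (
    major_axis float,
    minor_axis float,
    angle      float
);

CREATE TABLE sample (
    axis1 axis,
    axis2 axis,
    CONSTRAINT axis1_angle_check CHECK ((axis1).angle BETWEEN -90 AND 90),
    CONSTRAINT axis2_angle_check CHECK ((axis2).angle BETWEEN -90 AND 90)
);

The downside is that the check has to be repeated on every table and column that uses the type.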

MySQL Large DELETE with JOINs

Posted: 22 Jul 2013 05:13 PM PDT

I have a large table players with ~20MM rows which joins to a table stats with ~300MM rows, among others (photos, ...). I want to DELETE all records for players that were born_on before 1950. I should mention this is a Rails app, so I haven't properly constrained any of these relations with Foreign Keys (and I have a good number of indices on each of these tables, listed at the bottom).

[Edit] I would also like to delete all related data from stats, photos, etc. All in all, I would like to delete associated rows from about 10 tables, some of which are second-degree.

I would expect to delete the players with a query like so [updated]:

DELETE FROM "players","stats","photos"  USING "players    LEFT JOIN "stats"      ON "players".id = "stats".player_id    LEFT JOIN "photos"      ON "players".id = "photos".player_id  WHERE "players".born_on < "1950-01-01"  

But I'm running into the issue that this query, being a huge delete, takes far too long and too many resources to ever complete. Because the query uses a JOIN, MySQL won't let me add a LIMIT to the DELETE to break it into chunks.

Alternatively, I've tried deleting each row separately:

SELECT id FROM "players" where "players".born_on < "1950-01-01";  

Then

DELETE FROM "players" WHERE "players".id = 5;  DELETE FROM "stats" WHERE "stats".player_id = 5;  DELETE FROM "photos" WHERE "photos".player_id = 5;  ... repeat  

But the amount of parallelization necessary to run these against, say, 10MM rows also pushes the database to 100% CPU (running m1.xlarge on Amazon RDS), causing complete downtime for what would be several days of queries.

My question is: what is the best way to delete these old rows from the database without incurring significant downtime for my application? Are there settings, etc. that would make this simpler and more effective?

Please feel free to ask more questions about configuration, etc. as necessary to solve this problem. Thanks in advance for all of your help!
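One commonly used pattern for this situation (a sketch only; it assumes the schema below, where the column is birth_date rather than born_on, and the batch size is arbitrary) is to delete in small batches driven by a temporary list of ids, so each transaction stays short and the server and replication can keep up:

-- repeat until no rows remain; each pass removes at most 5,000 players
-- and their related rows
CREATE TEMPORARY TABLE doomed_ids (id INT PRIMARY KEY);

INSERT INTO doomed_ids
SELECT id FROM players
WHERE birth_date < '1950-01-01'
LIMIT 5000;

DELETE stats   FROM stats   JOIN doomed_ids d ON stats.player_id  = d.id;
DELETE photos  FROM photos  JOIN doomed_ids d ON photos.player_id = d.id;
DELETE players FROM players JOIN doomed_ids d ON players.id       = d.id;

DROP TEMPORARY TABLE doomed_ids;

Pausing between batches (or watching replica lag) keeps the load bounded instead of saturating the CPU.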

[Edit]

Schema

Players Table

CREATE TABLE `players` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(255) DEFAULT NULL,
  `middle_name` varchar(255) DEFAULT NULL,
  `last_name` varchar(255) DEFAULT NULL,
  `birth_date` datetime DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `team_id` int(11) DEFAULT NULL,
  `jersey_name` varchar(255) DEFAULT NULL,
  `home_city` varchar(255) DEFAULT NULL,
  `coach_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `players_team_id_last_name` (`team_id`,`last_name`),
  KEY `players_jersey_name` (`jersey_name`),
  KEY `players_home_city` (`home_city`),
  KEY `players_coach_id_index` (`coach_id`)
) ENGINE=InnoDB AUTO_INCREMENT=611 DEFAULT CHARSET=utf8;

Stats Table

CREATE TABLE `stats` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `player_id` int(11) DEFAULT NULL,
  `bucket_id` int(11) DEFAULT NULL,
  `description` varchar(4096) DEFAULT NULL,
  `meta1` longtext,
  `meta2` longtext,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `status` varchar(255) DEFAULT NULL,
  `confidence` float DEFAULT NULL,
  `viewed_at` datetime DEFAULT NULL,
  `view_count` int(11) DEFAULT '0',
  `reported_at` datetime DEFAULT NULL,
  `reserved` tinyint(1) DEFAULT '0',
  `ref_id` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `stats_player_id_bucket_id` (`player_id`,`bucket_id`),
  KEY `stats_ref_id_player_id_bucket_id` (`ref_id`,`player_id`,`bucket_id`),
  KEY `stats_status` (`status`)
) ENGINE=InnoDB AUTO_INCREMENT=10322 DEFAULT CHARSET=utf8;

Photos Table

CREATE TABLE `photos` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `player_id` int(11) DEFAULT NULL,
  `image` text,
  `source` varchar(255) DEFAULT NULL,
  `content_type` varchar(255) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `photo_type` varchar(255) DEFAULT NULL,
  `url` varchar(255) DEFAULT NULL,
  `lat` decimal(15,10) DEFAULT NULL,
  `lon` decimal(15,10) DEFAULT NULL,
  `caption` varchar(255) DEFAULT NULL,
  `place_name` varchar(255) DEFAULT NULL,
  `href` varchar(255) DEFAULT NULL,
  `posted_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `photos_player_id` (`player_id`)
) ENGINE=InnoDB AUTO_INCREMENT=313 DEFAULT CHARSET=utf8;

GROUP BY primary key or DISTINCT increases query time over 1000x with LIMIT

Posted: 22 Jul 2013 12:29 PM PDT

Also see http://stackoverflow.com/questions/17741167/hibernate-jpa-improve-performance-of-distinct-query but I realized this is mainly a PostgreSQL issue.

My application uses a 3rd-party extension to PostgreSQL for searching chemical structures. This is slow in general. I cannot change the SQL directly, as the application uses Hibernate and a native query is not an option.

I have a many-to-many relationship, and the "link table" has an additional column. Basically I have a Mixture that consists of elements, and an element occurs in the mixture at a certain percentage.

I use Spring Data JPA with QueryDSL, Hibernate and PostgreSQL. I have a query with 2 joins; it's a many-to-many with a link table that has additional columns.

I'm now searching all Mixtures that contain an element matching the given criteria. Because a mixture can have multiple elements that match the criteria, the query may return the same entity multiple times. I want to prevent that, hence DISTINCT or GROUP BY on the primary key.

The query is also paged, meaning it uses LIMIT and OFFSET. It runs perfectly fine without either DISTINCT or GROUP BY, but then I can get duplicate rows. If I add either GROUP BY or DISTINCT, the query is over 1000 times slower.

Query with DISTINCT (note SQL from hibernate):

select distinct
    simplecomp0_.chem_compound_id as chem1_0_,
    --snipped about 10 more columns, all short varchar or date fields
from simple_compound simplecomp0_
inner join compound_composition compositio1_
        on simplecomp0_.chem_compound_id=compositio1_.chem_compound_id
inner join chemical_structure chemicalst2_
        on compositio1_.chemical_structure_id=chemicalst2_.structure_id
where
    chemicalst2_.structure_id @ ('CCNc1ccccc1', '')::bingo.sub
limit 5
offset 5

EXPLAIN ANALYZE with DISTINCT:

"Limit  (cost=5984.58..5984.63 rows=5 width=1645) (actual time=6342.541..6342.543 rows=5 loops=1)"  "  ->  HashAggregate  (cost=5984.53..5989.79 rows=526 width=1645) (actual time=6342.538..6342.542 rows=10 loops=1)"  "        ->  Nested Loop  (cost=0.00..5971.38 rows=526 width=1645) (actual time=7.289..6166.512 rows=128527 loops=1)"  "              ->  Nested Loop  (cost=0.00..4445.81 rows=526 width=8) (actual time=7.281..5694.663 rows=128527 loops=1)"  "                    ->  Index Scan using idx_chemical_structure on chemical_structure chemicalst2_  (cost=0.00..20.26 rows=526 width=8) (actual time=7.262..5013.620 rows=128508 loops=1)"  "                          Index Cond: (chemical_structure @ ROW('CCNc1ccccc1'::text, ''::text)::bingo.sub)"  "                    ->  Index Only Scan using compound_composition_pkey on compound_composition compositio1_  (cost=0.00..8.40 rows=1 width=16) (actual time=0.003..0.004 rows=1 loops=128508)"  "                          Index Cond: (chemical_structure_id = chemicalst2_.structure_id)"  "                          Heap Fetches: 128527"  "              ->  Index Scan using idx_pk on simple_compound simplecomp0_  (cost=0.00..2.89 rows=1 width=1645) (actual time=0.002..0.003 rows=1 loops=128527)"  "                    Index Cond: (chem_compound_id = compositio1_.chem_compound_id)"  "Total runtime: 6344.584 ms"  

also http://explain.depesz.com/s/ocC

The long run time is caused by searching the 3rd-party chemical structure index. For some reason the whole index is searched.

If the DISTINCT is removed, LIMIT and OFFSET are correctly applied to the 3rd-party index and the query is fast:

EXPLAIN ANALYZE:

"Limit  (cost=56.76..113.52 rows=5 width=1645) (actual time=11.135..11.472 rows=5 loops=1)"  "  ->  Nested Loop  (cost=0.00..5971.38 rows=526 width=1645) (actual time=10.783..11.469 rows=10 loops=1)"  "        ->  Nested Loop  (cost=0.00..4445.81 rows=526 width=8) (actual time=10.774..11.406 rows=10 loops=1)"  "              ->  Index Scan using idx_chemical_structure on chemical_structure chemicalst2_  (cost=0.00..20.26 rows=526 width=8) (actual time=10.755..11.313 rows=10 loops=1)"  "                    Index Cond: (chemical_structure @ ROW('CCNc1ccccc1'::text, ''::text)::bingo.sub)"  "              ->  Index Only Scan using compound_composition_pkey on compound_composition compositio1_  (cost=0.00..8.40 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=10)"  "                    Index Cond: (chemical_structure_id = chemicalst2_.structure_id)"  "                    Heap Fetches: 10"  "        ->  Index Scan using simple_compound_pkey on simple_compound simplecomp0_  (cost=0.00..2.89 rows=1 width=1645) (actual time=0.004..0.004 rows=1 loops=10)"  "              Index Cond: (chem_compound_id = compositio1_.chem_compound_id)"  "Total runtime: 12.052 ms"  

Also http://explain.depesz.com/s/gU2

Is there any way I can tune PostgreSQL to apply the 3rd-party index only as far as the LIMIT and OFFSET clauses require, even when using DISTINCT?

EDIT:

After thinking further about the issue, I came to the conclusion that there is no solution I can implement. With GROUP BY or DISTINCT the whole query obviously must be run regardless of the LIMIT clause, and if the whole query is run, the 3rd-party index must be used, and that takes time (without that index such a search would take minutes, not seconds).

Now about statistics. Here is a quote from the supplier:

The cost for the structure index for operator @ is underestimated and hard-coded, so that Postgres uses the index in almost all cases. Again, as I mentioned, the cost-estimation function is not implemented yet (we'll let you know when it's ready).

Compare two Oracle database roles

Posted: 22 Jul 2013 02:39 PM PDT

I'm trying to re-engineer my corporation's user management and want to compare all the roles with each other to find duplicate ones. Doing it by hand is really tiring, and I need to compare around 150 roles. I wrote the following script:

DEFINE ROLE_1 = &INPUT_1
DEFINE ROLE_2 = &INPUT_2

SET FEEDBACK OFF
SET VERIFY OFF
SET HEADING ON
SET LINESIZE 140

COLUMN PRIVILEGE    FORMAT A20    HEADING "PRIVILEGE"
COLUMN TABLE_NAME   FORMAT A30    HEADING "TABLE_NAME"
COLUMN GRANTABLE    FORMAT A5     HEADING "GRANTABLE"

PROMPT
PROMPT OVERLAPPINGS BETWEEN 1st ROLE AND 2nd ROLE
PROMPT

SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_1' )
INTERSECT
SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_2' )
ORDER BY TABLE_NAME, PRIVILEGE ASC
/

PROMPT
PROMPT PRIVILEGES IN 1st ROLE BUT NOT IN 2nd ROLE
PROMPT

SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_1' )
MINUS
SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_2' )
ORDER BY TABLE_NAME, PRIVILEGE ASC
/

PROMPT
PROMPT PRIVILEGES IN 2nd ROLE BUT NOT IN 1st ROLE
PROMPT

SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_2' )
MINUS
SELECT PRIVILEGE, TABLE_NAME, GRANTABLE
FROM SYS.DBA_TAB_PRIVS
WHERE GRANTEE IN ( '&ROLE_1' )
ORDER BY TABLE_NAME, PRIVILEGE ASC
/

It all works fine, but for some reason it doesn't display the table headings.

The real reason I'm asking: is there a more comfortable way to automatically compare all roles with one another? Some neat way to do all this? I'm quite new to PL/SQL and don't really know how to write the loop. Is there perhaps even a way to display the corresponding columns of the tables?
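For the "compare everything with everything" part, a set-based alternative to looping (a sketch against DBA_TAB_PRIVS only; it ignores system and role privileges) is to count, for each pair of roles, the object privileges they share:

SELECT a.grantee AS role_1,
       b.grantee AS role_2,
       COUNT(*)  AS shared_privileges
FROM   sys.dba_tab_privs a
JOIN   sys.dba_tab_privs b
       ON  a.owner      = b.owner
       AND a.table_name = b.table_name
       AND a.privilege  = b.privilege
       AND a.grantee    < b.grantee          -- each pair only once
WHERE  a.grantee IN (SELECT role FROM dba_roles)
AND    b.grantee IN (SELECT role FROM dba_roles)
GROUP  BY a.grantee, b.grantee
ORDER  BY shared_privileges DESC;

Pairs whose shared count equals both roles' total privilege counts are candidates for being duplicates.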

Restore to last transaction log restore

Posted: 22 Jul 2013 12:31 PM PDT

We have a log shipping setup where we replicate from our primary database to a secondary. That database is, by way of log shipping, in standby and thus read-only. We set it up so we could run reports from it, but it seems the reporting software needs to go through the application server, which needs to log onto the server - in this case, Microsoft Dynamics AX and Microsoft Dynamics ERP.

Our plan is to have a tertiary database - this would also get log backups via log shipping, but in a different way - we would like to roll the database back to what it was at the last transaction log restore, then restore the latest log file(s). Is this possible, and how would we go about it?

EDIT 2013/07/21:

Allow me to rephrase my question. If I have a SQL database with full transaction logging, can I tell it to roll back all changes to a certain point, and if so, how may I accomplish this?
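If the goal is point-in-time recovery, SQL Server does this as part of a restore sequence rather than by rolling back a live database; a minimal sketch (the database name, backup paths and the timestamp are placeholders):

RESTORE DATABASE ReportDb
    FROM DISK = N'D:\Backups\ReportDb_full.bak'
    WITH NORECOVERY;

RESTORE LOG ReportDb
    FROM DISK = N'D:\Backups\ReportDb_log.trn'
    WITH STOPAT = '2013-07-21 06:00:00', RECOVERY;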

Thanks

Speeding up MySQL NOT IN (Subquery)

Posted: 22 Jul 2013 04:25 PM PDT

I'm currently trying to improve the performance of some of my queries. As far as I know, operators like IN or NOT IN stay fast even with a large number of values if the subquery selects an indexed field without conditions.

SELECT * FROM table1 WHERE field1 NOT IN (SELECT index_field FROM table2)

As soon as conditions are added on table2, the query becomes really slow on a large amount of data.

SELECT *
FROM table1
WHERE
    field1 NOT IN (
        SELECT
            index_field
        FROM table2
        WHERE
            user_id = '2'
    )
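One rewrite that often helps in MySQL (a sketch using the placeholder names above; it benefits from a composite index on table2 (user_id, index_field)) is an anti-join written as NOT EXISTS:

SELECT t1.*
FROM table1 t1
WHERE NOT EXISTS (
    SELECT 1
    FROM table2 t2
    WHERE t2.user_id = '2'
      AND t2.index_field = t1.field1
);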

I was thinking about using a LEFT JOIN, but when I need to do the filtering on user_id it also becomes slow.

How can I solve this problem? Currently I have no clue.

Kind regards,

Dominik

//--- EDIT -->

The original query looked something like this:

SELECT
    i.*
FROM stream_item si
LEFT JOIN user__publisher_item ui
    ON ui.user_id = 2
    && ui.item_id = si.item_id
INNER JOIN item i
    ON i.id = si.item_id
WHERE
    si.stream_id IN (5,7,8,9,19,24,29,42,43,44)
    && ui.id IS NULL || (
        ui.`read` = 0 || ui.saved = 1
    )
GROUP BY
    si.item_id
ORDER BY
    si.`found` DESC
LIMIT 0,10

I have a table which represents the n:m relationship between streams and items. A unique item can be accessible via multiple streams. On top of this I have a state table for the user-dependent states of an item. Currently, the user-state entry is only generated the first time an item is read or saved; otherwise there is no entry in the user_item table. In the beginning this query was fine, but now it takes a lot of time. Even if I leave out the INNER JOIN, it takes 7 seconds on the current tables with around 500k rows each.

//--- Edit 2 --> The table-structure in the background is as follows:

  • stream
  • stream_item //n:m relation between stream and items
  • item //unique items
  • stream_group //user-defined groups of multiple streams
  • stream_group_streams //n:m relation between stream_groups and streams
  • user
  • user_item //state-table that holds the user-dependent states of an item - if there is no state, no row exists for an item

By the way, sorry for my bad English.

Cannot connect to server. Too many symbolic links. Postgres Error

Posted: 22 Jul 2013 12:35 PM PDT

I get this error when I try connecting to a PostgreSQL server:

psql: could not connect to server: Too many levels of symbolic links
    Is the server running locally and accepting
    connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?

What can I do to fix this?

I have Postgres installed on my Mac.

Transaction Log maintenance when switching to Simple Recovery

Posted: 22 Jul 2013 02:07 PM PDT

Background:

I recently inherited 50+ SQL Servers with 450+ databases. The nightly backups are roughly 8 TB and, needless to say, we're using more disk space than we'd like. All of the databases are set to FULL recovery and the transaction logs have never been backed up. I've gone through all of the SQL Servers and identified low-priority ones that only need a nightly backup and where a day of data loss is acceptable.

Question:

I'm switching a lot of low-priority databases from FULL to SIMPLE recovery mode. Will the existing transaction logs be truncated (when checkpoints are created)? Some of the existing transaction logs are 50-100 GB; what is the best approach to determining what I should shrink them down to going forward? I obviously don't want to keep them that large. Or will they shrink on their own over time (I don't think they will)?
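For reference, the switch and a one-time shrink are usually done per database roughly like this (a sketch; the database name, the log file's logical name and the target size are placeholders; the logical name comes from sys.database_files):

ALTER DATABASE LowPriorityDb SET RECOVERY SIMPLE;

USE LowPriorityDb;
-- find the logical name of the log file
SELECT name, type_desc FROM sys.database_files;

-- one-time shrink to a sensible working size (here 4 GB, expressed in MB)
DBCC SHRINKFILE (LowPriorityDb_log, 4096);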

Filter on a window function without writing an outer SELECT statement

Posted: 22 Jul 2013 01:35 PM PDT

Since window functions cannot be included in the WHERE clause of the inner SELECT, is there another method that could be used to write this query without the outer SELECT statement? I'm using Oracle. Here is the sqlfiddle.

SELECT MERCHANTID, WAREHOUSEID, PRODUCTCODE
FROM (
  SELECT 0
  , WMP.MERCHANTID
  , WMP.WAREHOUSEID
  , WMP.PRODUCTCODE
  , RANK() OVER (PARTITION BY ML.MASTERMERCHANTID, WMP.PRODUCTCODE ORDER BY ML.PREFERENCEORDER ASC NULLS LAST) MERCH_RANK
  , RANK() OVER (PARTITION BY WMP.MERCHANTID, WMP.PRODUCTCODE ORDER BY WM.PREFERENCEORDER ASC NULLS LAST) WARE_RANK
  FROM MW_WAREHOUSEMERCHANTPRODUCT WMP
    LEFT OUTER JOIN MW_WAREHOUSEMERCHANT WM ON 0=0
                AND WMP.MERCHANTID  = WM.MERCHANTID
                AND WMP.WAREHOUSEID = WM.WAREHOUSEID
    LEFT OUTER JOIN MW_MERCHANTLINK ML ON 0=0
                AND WMP.MERCHANTID = ML.LINKEDMERCHANTID
    LEFT OUTER JOIN MW_WAREHOUSEMERCHANTPRODUCT MASTER ON 0=0
                AND ML.MASTERMERCHANTID = MASTER.MERCHANTID
                AND WMP.PRODUCTCODE     = MASTER.PRODUCTCODE
  WHERE 0=0
    AND WMP.STOCKLEVEL > 0
    AND NVL(MASTER.STOCKLEVEL, 0) <= 0
)
WHERE 0=0
  AND MERCH_RANK = 1
  AND WARE_RANK  = 1
;

How do I identify the remote db agent name to use in create_database_destination on Oracle 11gR2?

Posted: 22 Jul 2013 07:35 PM PDT

I am trying to setup DBMS_SCHEDULER in Oracle 11g to run a remote database job.

I have a remote Oracle 11g R2 database on unix and a local one on Windows.

I read that you can install the Oracle Scheduler agent from the 11g client install for machines that don't have Oracle installed, but that this is not needed for running remote jobs if Oracle is present on both machines. With the remote agent installation, you run schagent and provide parameters to register the agent with the remote machine, but I can't find any instructions on the web about how to register remote agents when both machines have Oracle installed, or what to use as the agent name in this case.

I have added an entry to tnsnames.ora for the remote DB and can tnsping, etc.

If I run the dbms_scheduler.create_database_destination procedure, it requires an agent name; where can I find this for the database, and how can I check that it is running, on either Unix or Windows?

How can the InnoDB ibdata1 file grow by 5x even with innodb_file_per_table set?

Posted: 22 Jul 2013 06:35 PM PDT

I have innodb_file_per_table set and just today my ibdata1 file jumped from 59M to 323M after I made several changes to an 800M table to reduce it to about 600M. That particular table's .ibd file was reduced but the server's ibdata1 file went crazy. Any ideas?

Is there slowdown inserting into an InnoDB table that has no index set?

Posted: 22 Jul 2013 08:35 PM PDT

I have an old application with lots of InnoDB tables, that have no indexes at all, not even a primary ID or such.

Those tables only contain a few thousand rows.

Would it be faster to INSERT data into these tables if I set a primary key (that I don't otherwise need)?

Loading XML documents to Oracle 11g DB with control file

Posted: 22 Jul 2013 03:35 PM PDT

I am using Oracle 11g XML database and trying to load XML documents into this DB with a control file and the sqlldr utility. All these XML files have an element that contains a date string with a timestamp (and the letter T in the middle of it). Oracle rejects this date string because of the T in it, and thus the XML file is not loaded into the DB.

I want to use the Oracle function TO_TIMESTAMP_TZ on the date string during the data load, but I do not know how to do it. That's where I need help. If there is any other way to import the XML (with date string with timestamp), I will try that also.

Here is the date entry in XML file:

<ns3:EntryDateTime cls="U">2013-04-20T21:02:52.468-04:00</ns3:EntryDateTime>  

And here is the entire control file:

load data
infile 'filelist.dat'
append into table STXP
xmltype(XMLDATA)
(
  filename filler char(120),
  XMLDATA lobfile(filename) terminated by eof
)

I believe I can also execute the above control file with the sqlldr utility from the command line, but I am not sure about this option. If this is possible, I guess I could ALTER SESSION (to somehow format the date string) before executing the control file.

The filelist.dat mentioned above contains entries for the input XML files, one XML file per line. The above date entry is required in each XML file. Each XML file has about 50 different elements, some required and some optional. I would greatly appreciate your help.

UPDATE: I successfully registered the schema that contains the definition for the date string, along with 100 other schemas, using a script. Since this script is very large, I am posting only 2 registration portions of it:

DECLARE
  SCHEMAURL VARCHAR2( 100 );
  SCHEMADOC VARCHAR2( 100 );
BEGIN
  SCHEMAURL := 'http://www.some.org/stxp/DataTypes.xsd';
  SCHEMADOC := 'DataTypes.xsd';
  DBMS_XMLSCHEMA.registerSchema(
      SCHEMAURL,
      BFILENAME( 'XSD_DIR', SCHEMADOC ),
      LOCAL => TRUE,      -- local
      GENTYPES => TRUE,   -- generate object types
      GENBEAN => FALSE,   -- no java beans
      GENTABLES => TRUE,  -- generate object tables
      OWNER => USER );

  SCHEMAURL := 'http://www.some.org/stxp/STXP.xsd';
  SCHEMADOC := 'STXP.xsd';
  DBMS_XMLSCHEMA.registerSchema(
      SCHEMAURL,
      BFILENAME( 'XSD_DIR', SCHEMADOC ),
      LOCAL => TRUE,      -- local
      GENTYPES => TRUE,   -- generate object types
      GENBEAN => FALSE,   -- no java beans
      GENTABLES => TRUE,  -- generate object tables
      OWNER => USER );
END;
/

The 2nd registration above is the last in the script, and this creates the table STXP, in which I am trying to load about 800 XML files. Each XML file has a root element called stxp.

This is the relevant definition of date string:

<xsd:simpleType name="DT" xdb:SQLType="TIMESTAMP WITH TIME ZONE">
    <xsd:restriction base="xsd:dateTime"/>
</xsd:simpleType>

And this is how I am using the above definition:

<element name="EntryDateTime" type="oth:DT"/>  

When I make the above element optional (for testing purposes) and remove the date string entry (mentioned near the top of this question) from my XML file, the XML file is loaded successfully into the Oracle XML database. When I put this entry back into the XML file (because it is required), Oracle rejects it.

Because I let Oracle take care of populating the STXP table with data from the XML files, I am not sure whether I can set a trigger to pre-process the date string from the input XML file before saving it in the database. I think there is a way to do it in the control file.

MySQL - run SELECT statement on another server without defining the table structure

Posted: 22 Jul 2013 02:35 PM PDT

In MySQL I can query information on another server using federated tables, as long as I've defined the same table structure locally.

In MS SQL Server, however, I can run any SQL statement against a linked server. Is it possible to do the same thing in MySQL?

Consequences of using "innodb_flush_method = O_DIRECT" without a battery-backed write cache, or on a KVM guest?

Posted: 22 Jul 2013 05:35 PM PDT

MySQL 5.5.29, InnoDB - 128 GB RAM - 32 cores - RAID 10 SSD.

Our server, a dedicated KVM guest on a bare-metal host, hosts our heavy read-write DB. Everything is file-per-table. The InnoDB buffer pool is 96 GB, with 2 x 1 GB log files that take about 20 minutes of writes to fill at peak time.

How bad would it be if O_DIRECT (we are currently running the default) were enabled during a heavy workload without a battery-backed write cache, and a total crash occurred on the OS or the parent host, or the power were cut?

Does a battery-backed write cache make a difference if the server is a VM guest of the parent anyway?


How do I grant a user account permission to create databases in PostgreSQL?

Posted: 22 Jul 2013 03:16 PM PDT

The title says it all. I cannot figure out how to give a user account the ability to create and drop databases. Is there a way to do this with GRANT?
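For reference, in PostgreSQL this is an attribute of the role rather than something granted with GRANT; a minimal sketch (role names and password are placeholders):

-- give an existing role the right to create databases
ALTER ROLE app_user CREATEDB;

-- or create a new role with the attribute from the start
CREATE ROLE reporting_user LOGIN PASSWORD 'secret' CREATEDB;

A role can drop the databases it owns; dropping arbitrary databases still requires ownership or superuser rights.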

How can I set a default session sql_mode for a given user?

Posted: 22 Jul 2013 04:35 PM PDT

I want to have a default session sql_mode that is different from the global sql_mode.

I can set it manually, but I'd like to do it automatically.

I thought I could add it to .my.cnf, but that doesn't work. I tried adding it to the [mysql] group in .my.cnf, and it does not throw an error.

Yet when I connect, my session's sql_mode still inherits the global sql_mode.

I'm using MySQL 5.5.16.
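One server-side workaround (a sketch; note that init_connect is a global setting rather than truly per-user, and it is not executed for accounts with the SUPER privilege):

-- executed for every new non-SUPER connection
SET GLOBAL init_connect = 'SET sql_mode=''STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION''';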

Tuning dedicated Percona Server with XtraDB for OLTP

Posted: 22 Jul 2013 02:34 PM PDT

Hardware specs for an OLTP DB

  • dual quad core xeon with hyperthreading
  • 24GB ram
  • Hardware RAID10 BBU

Software specs

  • Standard install of Ubuntu 10.04LTS with default LVM partitioning
  • Installed Percona Server with XtraDB

My priority is data integrity/protection followed closely by performance.

QUESTIONS

  • Are there settings on the RAID card that should or should not be set? Since I have a BBU, write-back caching should be enabled, right?
  • Is ext4 ok or should I create a separate xfs volume for the data?
  • What should the kernel scheduler be set to? noop? deadline?
  • Are there other settings that I should tweak at the Hardware/OS/Filesystem level?

Duplicate column for faster queries?

Posted: 22 Jul 2013 03:22 PM PDT

The title doesn't make much sense, but I couldn't think of a better one for this problem.

I have the following tables

Projects

  • id
  • name

Customers

  • id
  • id_project
  • name

Payments

  • id
  • id_customer
  • date
  • sum

When a user enters the system, he will have access to a certain project. Now I want to list all the payments for that project, which should be pretty easy:

SELECT * FROM payments WHERE id_customer IN (SELECT id FROM customers WHERE id_project = 5)

My question is: wouldn't it be better to add an id_project column to the payments table, so that the queries are easier and faster?
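Before denormalizing, it's worth noting that the same listing can be written as a plain join, which normally performs well with indexes on customers(id_project) and payments(id_customer); a sketch:

SELECT p.*
FROM payments p
JOIN customers c ON c.id = p.id_customer
WHERE c.id_project = 5;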
