Friday, April 5, 2013

Why is there such a huge performance difference in these two join statements?

Posted: 05 Apr 2013 06:00 PM PDT

I have a query which I first wrote as (the query has been simplified a bit):

select ROW_NUMBER() OVER (ORDER BY c.CategoryTypeID), c.name, count(p.id)
from category c
inner join product p on p.indID = c.id or p.catid = c.id and p.deleted = 1 and p.statusid = 1
group by c.name

This took 8 seconds.

If I change it to the following, it runs in under 1 second:

select ROW_NUMBER() OVER (ORDER BY c.CategoryTypeID), c.name, count(p.id)
from category c
inner join product p on p.indID = c.id or p.catid = c.id
where p.deleted = 0 and p.statusid = 1
group by c.name

I'd like to understand why there's such a huge difference in performance. Not only is the performance worse, but the first SQL query also seems to ignore my conditions: it counts records regardless of what p.deleted or p.statusid is.

What am I not understanding about how joins work? This is being run on SQL Server 2012.
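A likely explanation (an assumption on my part, not taken from an execution plan) is operator precedence: AND binds more tightly than OR, so the two queries are not logically equivalent, which is why the filters appear to be ignored and why the join gets so much more expensive.

-- How the parser reads the first ON clause:
ON p.indID = c.id
   OR (p.catid = c.id AND p.deleted = 1 AND p.statusid = 1)

-- The probable intent, with explicit parentheses (a sketch, not a confirmed fix):
ON (p.indID = c.id OR p.catid = c.id)
   AND p.deleted = 1
   AND p.statusid = 1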

How do I find an invalid utf8 character "somewhere" in my MySQL/trac database

Posted: 05 Apr 2013 03:47 PM PDT

I have an installation of trac, which uses MySQL. It has been upgraded so many times, moved servers etc. and chances are that the MySQL character sets were not always set correctly or consistently over the years. Currently all of them are utf8.

When attempting to upgrade the data using "trac-admin wiki update", I'm getting an error message that a byte ("UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 4274: invalid start byte") is not valid unicode. Unfortunately trac-admin gives me no hint where (table/row/column) to look for that byte sequence, or what I could do to fix it.

My question is not about trac/trac-admin, however; it's about the database. How would you go about finding, "somewhere" in the database, the offending bytes and replacing them with something that is at least valid utf8? I have attempted to mysqldump the database and re-import it, but MySQL gives no indication that anything might be wrong; the invalid bytes simply get re-imported.

Ideas?
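One way to at least narrow the search (a sketch; the table and column names here are placeholders to adapt per table, and since 0xA0 is also a legal continuation byte inside valid sequences such as a UTF-8 non-breaking space, this only flags candidates):

-- Find rows whose raw bytes contain 0xA0, the byte named in the traceback
SELECT id,
       LOCATE(UNHEX('A0'), CONVERT(wiki_text USING binary)) AS byte_pos
FROM wiki_pages
WHERE LOCATE(UNHEX('A0'), CONVERT(wiki_text USING binary)) > 0;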

Adding slave to existing master-master config in mysql throws Foreign key constraint failed error

Posted: 05 Apr 2013 07:22 PM PDT

We have two MySQL servers running in a master-master configuration.

Now we have to add a slave to the existing configuration.

But upon adding the third DB server and starting the slave on it, it throws a "Cannot add or update a child row: a foreign key constraint fails" error.

We have tried taking a mysqldump from the master with --master-data and restoring it. After that we started replication on the slave, but even then we get the same error. We have also tried XtraBackup, but that also throws the same error. Are we missing something?

EDIT 1

mysqldump command: mysqldump --master-data -uroot -p dbname > dbname.sql

We have also tried it this way: percona xtrabackup

In both cases, upon getting the foreign key error, we tried to dump and restore individual tables referenced by the foreign keys manually from the master to the slave. Upon doing this, the replication starts and seems to work normally with 0 seconds behind master for a few minutes, after which another foreign key error shows up, stopping the replication.

EDIT 2

auto_increment_increment is set to 2 in both masters. auto_increment_offset is set to 1 in Master 1 and 2 in Master 2. We have mostly InnoDB tables and some MyISAM tables.

MySQL version 5.5.30.

Transform XPath map into XML document using relational data

Posted: 05 Apr 2013 04:37 PM PDT

Background

Most modern databases have XML functions that can be used to extract data in an XML format. I want to avoid the task of manually calling XML functions to extract the data.

This problem involves devising a generic solution to create XML documents based on mapping database tables (and JOIN conditions) to XPath expressions.

Problem

An XPath map codifies associations between an XML document and relational data as follows:

root               > people
person             > person
person.first_name -> name/first
person.last_name  -> name/last
person.age        -> [@age]
account.person_id => person.person_id
account            > person/account
account.number    -> [@id]

Where a PERSON table might resemble:

person_id | first_name | last_name | age
      123 | Peter      | Parker    | 18
      456 | James      | Jameson   | 42

Where an ACCOUNT table might resemble:

account_id | person_id | number
         1 |       123 | 123456789

Calling a function using the XPath map would produce the following XML document:

<people>
  <person age="18">
    <name>
      <first>Peter</first>
      <last>Parker</last>
    </name>
    <account id="123456789" />
  </person>
  <person age="42">
    <name>
      <first>James</first>
      <last>Jameson</last>
    </name>
  </person>
</people>

In this case, James Jameson does not have an account and so the corresponding XML element (account) is not included in the final document.
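For comparison, the kind of hand-written query that a generic solution would have to generate might look like this in SQL Server (a sketch against the PERSON and ACCOUNT tables above; FOR XML PATH nests the account element and simply omits it when no account row exists):

SELECT
    p.age        AS [@age],
    p.first_name AS [name/first],
    p.last_name  AS [name/last],
    (SELECT a.number AS [@id]
     FROM account AS a
     WHERE a.person_id = p.person_id
     FOR XML PATH('account'), TYPE)
FROM person AS p
FOR XML PATH('person'), ROOT('people');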

This is a difficult problem and a complete solution is not necessary. A solution that handles 80% of simple tables mapped to simple XML elements and attributes would suffice.

Question

What algorithm would return an XML document based on such a generic XPath map?

The algorithm must transform the structure defined by the XPath map into an XML document with the content from the relations defined in the XPath map.

Are there any technologies, or open source implementations, that already perform such a task?


Database Install Issue

Posted: 05 Apr 2013 02:40 PM PDT

I was directed over here from Stack Overflow; hopefully you can help me out.

So I've got a dilemma. Our company has bought a volume license for Microsoft SQL Server 2012 Standard Edition. We purchased the per core license scheme. All of that didn't have any issues; we even installed it without any problems.

The issue is coming from Microsoft though. When I configured our Volume License Account they said:

We will issue a special installation file that will not require a key for usage. In order to install this to the proper processor allocation you purchased you'll need to install the file that many times.

So from what I've gathered, a Microsoft SQL Server 2012 Standard Edition per-core license is by default configured as dual-core. So I understand that if I have a quad-core I have to purchase it twice. What I don't understand is why I would need to install it two times.

I can't find any documentation on this.

That process doesn't seem correct. Is there a way to test how many cores SQL Server is running on? Any assistance would be terrific.

Things I've Tried:

  • SQL Installation Guide
  • SQL Getting Started
  • Calling Microsoft

Somehow I feel this issue shouldn't be nearly as complicated as it has become.

Those haven't brought me any closer to my goal. I get the notion that I should only have to install it once, but with what the Microsoft licensing technician said, I suddenly have a cloud of doubt over my head.

Thanks again.


Update:

So within our Microsoft Volume License Account this is all it provides with our download:

No Key Required. Description

Which I feel is adding to the confusion. We have it installed on the server, but I'm not entirely sure how to check that it is running on all cores. The receipt shows we've purchased multiple licenses.

If I understand the answer posted, it should be built into our license file from the installation. Is there a way to check that?
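On the "is it running on all cores" part, a couple of DMV queries show what the installed instance actually sees (a sketch; this reports CPU visibility, not the license count itself):

-- Edition and CPUs visible to the instance
SELECT SERVERPROPERTY('Edition') AS edition,
       cpu_count,
       hyperthread_ratio
FROM sys.dm_os_sys_info;

-- Schedulers that are online vs. offline (cores capped by edition or
-- licensing show up as VISIBLE OFFLINE)
SELECT status, COUNT(*) AS schedulers
FROM sys.dm_os_schedulers
WHERE status LIKE 'VISIBLE%'
GROUP BY status;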

Turning on TF610 in SSIS

Posted: 05 Apr 2013 11:44 AM PDT

I have a simple SSIS package that loads a dummy file that contains 100000 rows. Each row is around 4k long, one int column and one long text column. I'm trying to test TF610 when loading this data into a table with a clustered index.

In my SSIS package, my Control Flow has an Execute SQL Task to enable TF610, which on success goes to my Data Flow Task that loads the flat file into the table. Both the Execute SQL Task and the OLE DB Destination use the same connection.

SSIS Package

If I start a profile while running the SSIS package, and watch the commands, I can see DBCC TRACEON(610) executed then the INSERT BULK operations begin to fire. They both are using the same PID, so I'm assuming it's the same session.

Profile

When I check the log record length though, the insert is NOT being minimally logged.

If I enable TF610 globally and run the same SSIS package, though, the transaction is minimally logged.

I must be doing something wrong when turning on the TF610 in my SSIS package but I can't figure out what...
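One quick sanity check (a sketch): session-scoped trace flags only apply to the session that issued DBCC TRACEON, so verify from the load's own connection that the flag is still active when the inserts run. A common gotcha is that, without RetainSameConnection=True on the connection manager, the Execute SQL Task and the Data Flow can end up on different pooled connections.

-- Run on the connection doing the load (e.g. as another Execute SQL Task)
DBCC TRACESTATUS (610);        -- session-level status
DBCC TRACESTATUS (610, -1);    -- global status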

Should all queries where you expect a specific order, include an ORDER BY clause?

Posted: 05 Apr 2013 12:01 PM PDT

In order to better understand the SQL Server query processor, I've been thinking about how the ORDER BY clause works, and how SQL Server provides results.

It appears SQL Server will return results in exactly the same order for a given set of data, as long as nothing ever changes. As soon as you introduce any kind of uncertainty, such as a developer changing something, the order of results can no longer be expected to stay the same.

Simply seeing the results in the same order each time you press F5 does not guarantee the order you expect.

Try this:

USE TempDB;
DROP TABLE TestOrderBy;
DROP TABLE TestOrderBy2;
CREATE TABLE TestOrderBy
(
    ID INT NOT NULL CONSTRAINT PK_TestOrderBy PRIMARY KEY CLUSTERED IDENTITY(1,1)
    , SomeData varchar(255)
);
INSERT INTO TestOrderBy (SomeData) VALUES ('E');
INSERT INTO TestOrderBy (SomeData) VALUES ('D');
INSERT INTO TestOrderBy (SomeData) VALUES ('C');
INSERT INTO TestOrderBy (SomeData) VALUES ('B');
INSERT INTO TestOrderBy (SomeData) VALUES ('A');

CREATE TABLE TestOrderBy2
(
    ID INT NOT NULL CONSTRAINT PK_TestOrderBy2 PRIMARY KEY CLUSTERED IDENTITY(1,1)
    , SomeData varchar(255)
);
INSERT INTO TestOrderBy2 (SomeData) VALUES ('E');
INSERT INTO TestOrderBy2 (SomeData) VALUES ('D');
INSERT INTO TestOrderBy2 (SomeData) VALUES ('C');
INSERT INTO TestOrderBy2 (SomeData) VALUES ('B');
INSERT INTO TestOrderBy2 (SomeData) VALUES ('A');

SELECT SomeData
FROM TestOrderBy
UNION ALL
SELECT SomeData
FROM TestOrderBy2;

CREATE INDEX IX_TestOrderBy_SomeData ON TestOrderBy (SomeData);
CREATE INDEX IX_TestOrderBy2_SomeData ON TestOrderBy2 (SomeData);

SELECT SomeData
FROM TestOrderBy
UNION ALL
SELECT SomeData
FROM TestOrderBy2;

The results:

[screenshot of the two result sets]

As you see, adding a simple index on the fields selected in the query alters the order of the results.

From here on, I'm adding ORDER BY unless I truly don't care.
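For completeness, the only guarantee is the explicit clause:

SELECT SomeData FROM TestOrderBy
UNION ALL
SELECT SomeData FROM TestOrderBy2
ORDER BY SomeData;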

How to JOIN two tables to get missing rows in the second table

Posted: 05 Apr 2013 02:27 PM PDT

In a simple voting system such as

CREATE TABLE elections (
  election_id int(11) NOT NULL AUTO_INCREMENT,
  title varchar(255),

CREATE TABLE votes (
  election_id int(11),
  user_id int(11),
  FOREIGN KEYs

To get the list of elections a user has voted in, the following JOIN is used:

SELECT * FROM elections
JOIN votes USING(election_id)
WHERE votes.user_id='x'

But how do I get the list of elections a user has NOT voted in?
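A common pattern (a sketch, one of several equivalent ways to write an anti-join) is to LEFT JOIN only this user's votes and keep the elections with no match:

SELECT e.*
FROM elections e
LEFT JOIN votes v
       ON v.election_id = e.election_id
      AND v.user_id = 'x'
WHERE v.election_id IS NULL;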

Create a constant in Postgresql [migrated]

Posted: 05 Apr 2013 12:20 PM PDT

Suppose that I have this query:

select *
from myTable
where myTable.myCol in (1,2,3)

I would like to do that:

with allowed_values as (1,2,3)
select *
from myTable
where myTable.myCol in allowed_values

It gives me a syntax error on the first line. Can you help me fix it?
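A CTE body has to be a complete query, so one way to express this (a sketch) is to wrap the constants in VALUES and reference the CTE in a subquery:

WITH allowed_values (val) AS (
    VALUES (1), (2), (3)
)
SELECT m.*
FROM myTable m
WHERE m.myCol IN (SELECT val FROM allowed_values);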

"Site Offline" MySQL server failing to start and stop

Posted: 05 Apr 2013 04:55 PM PDT

I'm hosting a few sites on linode and have encountered a strange problem. For the last two years the site has been running perfectly fine, and now randomly all the websites on the server go to the "site offline check the settings.php" page. No changes have been made to the website at all recently.

When I try to do a mysqld stop, it says busy for a while and then finally says failed, while doing a start results in an instant fail.

Many of the threads I googled suggest the hostname is off in the settings.php but since the website has been up for two years I don't think this can be the case.

I haven't had to troubleshoot MySQL before, but if I remember correctly the log file I should be looking at is Hostname.err. In there, it has a large chain of errors concerning InnoDB. The one listed as a fatal error says:

InnoDB: Fatal error: cannot allocate memory for the buffer pool

I would appreciate any suggestions, and if there are log files that would help let me know.

Edit: Requested information. 1) When I examine the cnf, it appears that all lines involving InnoDB are commented out with #. It must have always been this way, as I have not modified it.

2)mysql Ver 14.14 Distrib 5.5.14, for Linux (i686) using readline 5.1

3) It looks like there is a giant jump in the I/O rate every time I try to start the database / before the websites die. How do I go about clearing those cache tables if they have become too unmanageable, and is there any risk of losing anything?

Third Edit (Now with datadir):

How much RAM is on the VM ? 512MB

How much space does datadir have ?

Size: 20G Used: 19G Available 0 Use Percent: 100%

What is the size of ibdata1 ( ls -lh /var/lib/mysql/ibdata1)? 114M

FYI: If anyone looks here with the same problem in the future, my issue was that the binary log files had consumed my entire disk space. See here: http://systembash.com/content/mysql-binary-log-file-size-huge/
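In case it helps the next person: once some space has been freed, the binary logs can be trimmed from the MySQL prompt and capped going forward (a sketch; make sure no slave still needs the older logs first):

PURGE BINARY LOGS BEFORE NOW() - INTERVAL 7 DAY;
SET GLOBAL expire_logs_days = 7;   -- and set expire_logs_days = 7 in my.cnf to persist it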

How to recover data from corrupted SQL Server database?

Posted: 05 Apr 2013 01:00 PM PDT

We had several power outages and the server rebooted a couple of times, which seems to have caused issues with one of the databases. We tried detaching and attaching the database again, but it looks like the database is corrupted and we're getting:

Msg 5172, Level 16, State 15, Line 1

claiming that the database header is invalid. Is there anything we can do to repair the database or extract data from the MDF file? We do have a backup, but it's about two weeks old and doesn't contain all the data.
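The usual triage, if the database can still be attached or brought online at all, is an emergency-mode repair (a sketch with a placeholder database name; REPAIR_ALLOW_DATA_LOSS can discard data, so copy the MDF/LDF files somewhere safe first, and a damaged file header like Msg 5172 may still defeat this, in which case third-party MDF extraction tools or the old backup are the remaining options):

ALTER DATABASE YourDb SET EMERGENCY;
ALTER DATABASE YourDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB ('YourDb', REPAIR_ALLOW_DATA_LOSS);
ALTER DATABASE YourDb SET MULTI_USER;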

Idle connection plus schema-modifying query causing locked database

Posted: 05 Apr 2013 05:26 PM PDT

As part of our automated deployment process for a web app running on a LAMP stack, we drop all our triggers and stored procedures and recreate them from source control. It turns out there was a hidden danger to this approach that we hadn't thought about.

A few days ago we managed to end up with the database for (the staging version of) our web app stuck in a horribly hung state after the following sequence of events:

  1. I connect to the remote database from our office (via Python's MySQLdb, as it happens) and run a few SELECT queries on the Foo table.
  2. I leave the connection open, because I'm lazy.
  3. My boss commits some changes on his laptop, pushes to the remote repo on the web server, and goes to lunch without looking at the output
  4. The deployment hook on the web server tries to update the triggers and stored procedures in the database, but isn't able to even DROP the first trigger because the trigger involves the Foo table, which my currently sleeping connection had previously done some SELECTs from.
  5. Now nobody can SELECT from the Foo table at all, because the connection trying to DROP the trigger has already taken out a lock on the Foo table that prevents any other connections from accessing the Foo table in any way - even though it's still waiting for the sleeping connection to be closed before it can actually do anything.
  6. Crucial business processes relying upon the Foo table grind to a halt, alarms sound, and our web app stops serving customers. My boss flies into a rage and declares that heads will roll if the cause of the problem is not found and fixed so that this can never happen again. (Just kidding, it was only our staging server and my boss is very friendly.)

What's interesting is that this scenario wasn't caused by any kind of deadlock; it was caused by a sleeping connection implicitly holding some kind of lock that prevented the DROP TRIGGER statement from executing, just by virtue of having done a SELECT on the same table previously. None of the anti-deadlock features of MySQL could automatically kill a process and save the situation, because ultimately everything could continue as soon as my original process - the idle one that had only ever done SELECTs - was killed. The fact that MySQL locks behave this way by default seems perverse to me, but that's not the point. I'm trying to figure out a way to ensure that the disaster scenario described above can't ever recur (especially on our live server). How would you suggest I do this?

We've talked the problem over in the office, and there are a couple of hypothetical solutions we saw:

  • Change some config setting somewhere so that sleeping processes time out after 10 seconds by default, so that a sleeping process can never sit on locks. Better yet, have them just release all locks after 10 seconds so that I can still go to lunch and leave my MySQL shell open, or my Python window open with a MySQLdb connection active, then come back and use it, without fear of breaking anything.

    • This might be really irritating when trying to run queries manually, especially ones that require grouping into a transaction.
  • Work some magic on the queries that try to replace the triggers and stored procedures so that the acquisition of locks required for the relevant DROPs and CREATEs is made into an atomic operation - something like, if the query can't acquire all the locks it needs immediately in sequence, then it releases them and tries again periodically until it works.

    • This might just make our deployment process never complete, though, if the database is too busy for it to be able to grab all the locks in one go.
  • Drastically reduce the frequency of schema-modifying queries we make (it only seems to be these that can be blocked from starting by a connection that's only done SELECTs), for instance by having our deployment script check whether a stored procedure or trigger in source control has changed from the version in the database before DROPping and reCREATEing the one on the database.

    • This only mitigates the problem, it doesn't actually eliminate it.

We're not sure if either of the first two solutions we considered are even possible in MySQL, though, or if we're missing a better solution (we're developers, not DBAs, and this is outside of our comfort zone). What would you recommend?
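On the first idea: MySQL does expose the relevant knobs, although the values below are placeholders rather than recommendations:

-- Idle client connections are dropped after this many seconds, which also
-- releases any metadata locks they are holding.
SET GLOBAL wait_timeout = 600;
SET GLOBAL interactive_timeout = 600;

-- From MySQL 5.5.3 on, this caps how long DDL such as DROP TRIGGER waits
-- for its metadata lock before failing with an error, instead of sitting
-- there and blocking every other user of the table.
SET GLOBAL lock_wait_timeout = 30;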

ibdata1 grows exponentially when innodb_file_per_table is configured

Posted: 05 Apr 2013 12:20 PM PDT

I have installed a MySQL cluster with InnoDB (innodb_file_per_table enabled subsequently), but since I switched to innodb_file_per_table, the ibdata1 file keeps growing (about 2 GB a month).

Is my my.cnf file correct?

Why is my ibdata1 so big (22 GB)?

How can I see what is inside ibdata1?

Server Configuration:

  • Debian 6 amd64

  • mysql-client-5.1 5.1.61-0+squeeze1

My InnoDB Configuration File:

#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
#

innodb_data_home_dir            = /var/lib/mysql
innodb_data_file_path           = ibdata1:10M:autoextend
innodb_log_group_home_dir       = /var/lib/mysql
innodb_file_per_table
innodb_buffer_pool_size         = 5G
innodb_additional_mem_pool_size = 48M
innodb_log_files_in_group       = 3
innodb_log_file_size            = 512M
innodb_log_buffer_size          = 8M
innodb_flush_log_at_trx_commit  = 0
innodb_lock_wait_timeout        = 50
innodb_thread_concurrency       = 15
innodb_flush_method             = O_DIRECT

Procedure:

1) Dump All DB's
2) Stop MySQL
3) Add "innodb_file_per_table"
4) Delete all ib* file
5) Start MySQL
6) Import All DB's

InnoDB Conf:

mysql> show variables like 'innodb%';
+-----------------------------------------+------------------------+
| Variable_name                           | Value                  |
+-----------------------------------------+------------------------+
| innodb_adaptive_hash_index              | ON                     |
| innodb_additional_mem_pool_size         | 50331648               |
| innodb_autoextend_increment             | 8                      |
| innodb_autoinc_lock_mode                | 1                      |
| innodb_buffer_pool_size                 | 25165824000            |
| innodb_checksums                        | ON                     |
| innodb_commit_concurrency               | 0                      |
| innodb_concurrency_tickets              | 500                    |
| innodb_data_file_path                   | ibdata1:10M:autoextend |
| innodb_data_home_dir                    | /var/lib/mysql         |
| innodb_doublewrite                      | ON                     |
| innodb_fast_shutdown                    | 1                      |
| innodb_file_io_threads                  | 4                      |
| innodb_file_per_table                   | ON                     |
| innodb_flush_log_at_trx_commit          | 0                      |
| innodb_flush_method                     | O_DIRECT               |
| innodb_force_recovery                   | 0                      |
| innodb_lock_wait_timeout                | 50                     |
| innodb_locks_unsafe_for_binlog          | OFF                    |
| innodb_log_buffer_size                  | 8388608                |
| innodb_log_file_size                    | 536870912              |
| innodb_log_files_in_group               | 3                      |
| innodb_log_group_home_dir               | /var/lib/mysql         |
| innodb_max_dirty_pages_pct              | 90                     |
| innodb_max_purge_lag                    | 0                      |
| innodb_mirrored_log_groups              | 1                      |
| innodb_open_files                       | 300                    |
| innodb_rollback_on_timeout              | OFF                    |
| innodb_stats_method                     | nulls_equal            |
| innodb_stats_on_metadata                | ON                     |
| innodb_support_xa                       | ON                     |
| innodb_sync_spin_loops                  | 20                     |
| innodb_table_locks                      | ON                     |
| innodb_thread_concurrency               | 15                     |
| innodb_thread_sleep_delay               | 10000                  |
| innodb_use_legacy_cardinality_algorithm | ON                     |
+-----------------------------------------+------------------------+
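On the "how can I see what is inside ibdata1" question: the shared tablespace always keeps the data dictionary, undo logs, insert buffer, and doublewrite buffer even with innodb_file_per_table on, so a first pass (a sketch) is to compare the per-table sizes reported by the data dictionary with the .ibd files on disk; whatever growth is not in the .ibd files must be inside ibdata1 (often undo kept alive by long-running transactions):

SELECT table_schema, table_name, engine,
       ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
FROM information_schema.TABLES
WHERE engine = 'InnoDB'
ORDER BY (data_length + index_length) DESC
LIMIT 20;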

Bandwidth comparison between log shipping and transactional replication

Posted: 05 Apr 2013 04:23 PM PDT

Which technique uses more network bandwidth:

  1. Log shipping
  2. Transactional Replication

Can someone share any benchmarks for the same?

What would be the memory and I/O impact on the primary server when we use either technique?

Thanks,

Piyush Patel

How to reinsert corrected rows from the conflict table?

Posted: 05 Apr 2013 04:23 PM PDT

I have bidirectional merge replication. I had constraint failures because the primary key was just an integer, so I changed the primary key to the old primary key plus a location identifier. The problem is: how can I reinsert the old rows from the conflict table (which I can correct manually in MSmerge_conflict_) into the publishers and subscribers? Can you help me please?

Sorry for any mistakes; I'm not a native English speaker.

Automated SQL backup on a regular schedule, and cleaning up the backed-up data from the database

Posted: 05 Apr 2013 12:32 PM PDT

I need to back up a SQL Server database (a historian) on a regular schedule, and then clean up the database by removing the backed-up data.

I am using MS SQL 2008 (R2), on a Windows XP machine. The biggest issue is the very limited hard disk space. The database is limited to a maximum of 3GB! In terms of overall performance, the PC is really slow, and unfortunately I do not have the choice to change that. So, I could consider backing up overnight when the data flow is expected to be less.

The intention is to back up the data every two weeks and store it in a special directory (e.g. C:\). An operator can then move the backup to another machine. Given the limited space, I could consider some 'house cleaning' by removing the backed-up data. What is more important is the ability to merge the regular backups into an external database, so perhaps a typical SQL backup-and-restore routine could be an option.
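A minimal sketch of the scheduled step (database name, path, and the table/column in the cleanup are placeholders; the cleanup would run only after the backup succeeds):

BACKUP DATABASE [Historian]
TO DISK = N'C:\Backups\Historian_full.bak'
WITH INIT, CHECKSUM;

-- Remove backed-up rows in small batches to keep the transaction log small
WHILE 1 = 1
BEGIN
    DELETE TOP (5000) FROM dbo.HistoryData
    WHERE SampleTime < DATEADD(DAY, -14, GETDATE());
    IF @@ROWCOUNT = 0 BREAK;
END;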

I would appreciate your kind advice regarding this matter. Thank you.

MySQL slave stuck in "Reading event from the relay log"?

Posted: 05 Apr 2013 12:09 PM PDT

  • 5.5.28-log MySQL Community Server (GPL) by Remi
  • binlog-format=MIXED

My problem is similar to this question.

*************************** 2. row ***************************
     Id: 973415
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 29062
  State: Reading event from the relay log
   Info: NULL

It looks like a bug, except that no one mentioned that version 5.5.28 is affected.

Here is the additional information:

mysql> show slave status\G

          Master_Log_File: mysql-bin.006413
      Read_Master_Log_Pos: 40371120
           Relay_Log_File: mysqld-relay-bin.011003
            Relay_Log_Pos: 226592090
    Relay_Master_Log_File: mysql-bin.006392
         Slave_IO_Running: Yes
        Slave_SQL_Running: Yes
          Replicate_Do_DB:
      Replicate_Ignore_DB:
       Replicate_Do_Table:
   Replicate_Ignore_Table:
  Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
               Last_Errno: 0
               Last_Error:
             Skip_Counter: 0
      Exec_Master_Log_Pos: 226591944
          Relay_Log_Space: 5708184440
          Until_Condition: None
           Until_Log_File:
            Until_Log_Pos: 0
       Master_SSL_Allowed: No
       Master_SSL_CA_File:
       Master_SSL_CA_Path:
          Master_SSL_Cert:
        Master_SSL_Cipher:
           Master_SSL_Key:
    Seconds_Behind_Master: 29531

mysql> show engine innodb status;

---TRANSACTION 1709EE56, ACTIVE 29524 sec fetching rows
mysql tables in use 8, locked 8
94647 lock struct(s), heap size 14121400, 45755176 row lock(s), undo log entries 7547
MySQL thread id 973415, OS thread handle 0x40d90940, query id 1773214543 Reading event from the relay log

mysqlbinlog --start-position=226591944 mysql-bin.006392

#130401 15:18:23 server id 248  end_log_pos 226592078   Table_map: `reportingdb`.`v3_cam_ip` mapped to number 4973102
#130401 15:18:23 server id 248  end_log_pos 226593102   Delete_rows: table id 4973102
#130401 15:18:23 server id 248  end_log_pos 226594131   Delete_rows: table id 4973102
#130401 15:18:23 server id 248  end_log_pos 226595169   Delete_rows: table id 4973102

The command that the user ran on the master:

DELETE FROM v3_cam_ip WHERE dt ='2013-03-15' LIMIT 100000  

mysql> show create table v3_cam_ip\G

*************************** 1. row ***************************
       Table: v3_cam_ip
Create Table: CREATE TABLE `v3_cam_ip` (
  `campaignid` mediumint(9) unsigned DEFAULT NULL,
  `ip` varchar(20) COLLATE latin1_bin NOT NULL DEFAULT '',
  `dt` date NOT NULL,
  KEY `ix_campaignid` (`campaignid`),
  KEY `ix_dt` (`dt`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_bin
/*!50100 PARTITION BY RANGE (TO_DAYS(dt))
(PARTITION p00 VALUES LESS THAN (0) ENGINE = InnoDB,
 PARTITION p01 VALUES LESS THAN (734653) ENGINE = InnoDB,
 PARTITION p02 VALUES LESS THAN (734745) ENGINE = InnoDB,
 PARTITION p05 VALUES LESS THAN (734965) ENGINE = InnoDB,
 PARTITION p06 VALUES LESS THAN (735160) ENGINE = InnoDB,
 PARTITION p07 VALUES LESS THAN (735210) ENGINE = InnoDB,
 PARTITION MERGER_201341 VALUES LESS THAN (735324) ENGINE = InnoDB,
 PARTITION pcurrent_201341 VALUES LESS THAN (735325) ENGINE = InnoDB) */

mysql> show keys from v3_cam_ip\G

*************************** 1. row ***************************
        Table: v3_cam_ip
   Non_unique: 1
     Key_name: ix_campaignid
 Seq_in_index: 1
  Column_name: campaignid
    Collation: A
  Cardinality: 20
     Sub_part: NULL
       Packed: NULL
         Null: YES
   Index_type: BTREE
      Comment:
Index_comment:
*************************** 2. row ***************************
        Table: v3_cam_ip
   Non_unique: 1
     Key_name: ix_dt
 Seq_in_index: 1
  Column_name: dt
    Collation: A
  Cardinality: 20
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
2 rows in set (0.95 sec)

What I have done on one of two Slaves:

  • stop slave; hangs so I have to press Ctrl-C
  • then I tried to restart the MySQL (using init script) but it failed
  • after that I did a kill -9 (silly?) and start again
  • now the slave is started, but it doesn't get updates from the master

What should I do on the remaining slave?


gdb --batch --quiet -ex 'set pagination off' -ex 'thread apply all bt full' -ex 'quit' -p $(pidof mysqld)

#4  0x0000000000926b99 in ha_partition::handle_unordered_next(unsigned char*, bool) ()
No symbol table info available.
#5  0x0000000000733de9 in Rows_log_event::find_row(Relay_log_info const*) ()
No symbol table info available.
#6  0x0000000000734079 in Delete_rows_log_event::do_exec_row(Relay_log_info const*) ()
No symbol table info available.
#7  0x0000000000734a92 in Rows_log_event::do_apply_event(Relay_log_info const*) ()
No symbol table info available.
#8  0x000000000051774e in apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) ()

The full backtrace: http://fpaste.org/pXvT/
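For reference, the backtrace (ha_partition::handle_unordered_next inside Rows_log_event::find_row) means the SQL thread is scanning partitions to locate every deleted row, because v3_cam_ip has no PRIMARY KEY or unique index that the row-based Delete_rows events could seek on. One mitigation (a sketch only; an ALTER on a table this size is itself expensive and has to be coordinated across master and slaves):

ALTER TABLE v3_cam_ip
    ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    ADD PRIMARY KEY (id, dt);   -- dt must be in the key because of PARTITION BY RANGE (TO_DAYS(dt))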

Which database could handle storage of billions/trillions of records?

Posted: 05 Apr 2013 08:30 PM PDT

We are looking at developing a tool to capture and analyze netflow data, of which we gather tremendous amounts. Each day we capture about ~1.4 billion flow records, which look like this in JSON format:

{
    "tcp_flags": "0",
    "src_as": "54321",
    "nexthop": "1.2.3.4",
    "unix_secs": "1352234521",
    "src_mask": "23",
    "tos": "0",
    "prot": "6",
    "input": "105",
    "doctets": "186",
    "engine_type": "0",
    "exaddr": "2.3.4.5",
    "engine_id": "2",
    "srcaddr": "9.8.7.6",
    "dst_as": "12345",
    "unix_nsecs": "752265174",
    "sysuptime": "2943529544",
    "dst_mask": "24",
    "dstport": "80",
    "last": "2943523241",
    "srcport": "52672",
    "dpkts": "4",
    "output": "111",
    "dstaddr": "6.5.4.3",
    "first": "2943517993"
}

We would like to be able to do fast searches (less than 10 seconds) on the data set, most likely over narrow slices of time (10 to 30 minute intervals). We also want to index the majority of the data points so we can search on each of them quickly. We would also like to have an up-to-date view of the data when searches are executed. It would be great to stay in the open source world, but we are not opposed to looking at proprietary solutions for this project.

The idea is to keep approximately one month of data, which would be ~43.2 billion records. A rough estimate that each record would contain about 480 bytes of data, would equate to ~18.7 terabytes of data in a month, and maybe three times that with indexes. Eventually we would like to grow the capacity of this system to store trillions of records.

We have (very basically) evaluated Couchbase, Cassandra, and MongoDB so far as possible candidates for this project, but each presents its own challenges. With Couchbase the indexing is done at intervals and not during insertion of the data, so the views are not up to date; Cassandra's secondary indexes are not very efficient at returning results, as they typically require scanning the entire cluster; and MongoDB looks promising but appears to be far more difficult to scale, as it is master/slave/sharded. Some other candidates we plan to evaluate are Elasticsearch, MySQL (not sure if this is even applicable), and a few column-oriented relational databases. Any suggestions or real-world experience would be appreciated.

Refactoring/normalization - but almost empty table

Posted: 05 Apr 2013 11:54 AM PDT

I normalized a legacy DB into this structure: [ER diagram]

But I'm not sure if it is correctly normalized. I don't feel very comfortable with the almost empty filters table.

The requirements

  • A group contains several items.
  • A filter organizes how an item is "displayed".
  • Multiple terms define a filter.
  • Different groups may contain the same item, but with a different filter.
  • A filter can have no terms.

Not really important requirements

  • A filter might be used several times.

Btw. the diagram was created using http://cakeapp.com.

Using MySQL InnoDB as an Archive

Posted: 05 Apr 2013 01:59 PM PDT

My site has a main MySQL InnoDB table that it does most of its work on. New rows get inserted at a rate of 1 million per week, and rows older than a week get moved over to an archive table on a daily basis. These archived rows are processed once a week for stuff like finding trends.

This archive table consequently grows at 1 million new rows every week, and querying it can get really slow. Is MySQL suited for archiving data, or is my strategy very flawed?
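If the archive stays in MySQL, range-partitioning it by week (a sketch; table and column names are illustrative) keeps the weekly trend job reading only recent partitions and makes purging old weeks a metadata operation instead of a huge DELETE:

CREATE TABLE archive (
    id         BIGINT UNSIGNED NOT NULL,
    created_at DATE NOT NULL,
    payload    TEXT,
    PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2013w14 VALUES LESS THAN (TO_DAYS('2013-04-08')),
    PARTITION p2013w15 VALUES LESS THAN (TO_DAYS('2013-04-15')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);

ALTER TABLE archive DROP PARTITION p2013w14;   -- drops a whole week almost instantly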

Please advise, thank you!

ORA-16000 when trying to perform select on read only access ORACLE database

Posted: 05 Apr 2013 12:59 PM PDT

My application's SQL encounters ORA-16000 when trying to access a read-only Oracle database:

ORA-16000: database open for read-only access
ORA-06512: at "SYS.XMLTYPE", line 272
ORA-06512: at line 1
### The error may exist in com/xyz/report/dao/Transaction001Mapper.xml

This is the query that involves XMLTYPE; INTERFACE_CONTENT is a CLOB column:

SELECT CONTENTS FROM ERRCODES WHERE
     CODE=(SELECT xmltype(INTERFACE_CONTENT).extract('/Message/PaymentStatus/ps:FIToFIPmtStsRpt/ps:TxInfAndSts/ps:StsRsnInf/ps:Rsn/ps:Prtry/text()','xmlns="urn:abcde" xmlns:head="urn:iso:std:iso:20022:tech:xsd:head.001.001.01" xmlns:ps="urn:iso:std:iso:20022:tech:xsd:pacs.002.001.03"').getstringval() APP_CODE
FROM MESSAGE_EXTERNAL_INTERACTION MEI WHERE MEI.MSG_TYPE='Pacs_002'
    AND MEI.MID='MY_ID')

I also use a lot of EXTRACTVALUE() calls on an XMLTYPE field.

The SQL works perfectly if the database is not read-only (i.e., read-write).

My question is: what is the issue here? Is it related to some missing privileges/grants?

MySQL Table not repairing

Posted: 05 Apr 2013 07:59 PM PDT

Table info:

Database name: user_motiva
Table name:    wp_options.frm
               wp_options.MYD
               wp_options.MYI
               wp_options.TMD

When I do a mysqlcheck -r --all-databases, it hangs on that table even if I let it sit all day.

Is there another way to fix/repair/recover that table?

Should I use myisamchk? I saw something like:

shell> myisamchk --recover City  

My config on a 16 GB RAM box:

cat /etc/my.cnf
[mysqld]
default-storage-engine=MyISAM
local-infile=0
symbolic-links=0
skip-networking
max_connections = 500
max_user_connections = 20
key_buffer = 512M
myisam_sort_buffer_size = 64M
join_buffer_size = 64M
read_buffer_size = 12M
sort_buffer_size = 12M
read_rnd_buffer_size = 12M
table_cache = 2048
thread_cache_size = 16K
wait_timeout = 30
connect_timeout = 15
tmp_table_size = 64M
max_heap_table_size = 64M
max_allowed_packet = 64M
max_connect_errors = 10
query_cache_limit = 1M
query_cache_size = 64M
query_cache_type = 1
low_priority_updates=1
concurrent_insert=ALWAYS
log-error=/var/log/mysql/error.log
tmpdir=/home/mysqltmp
myisam_repair_threads=4
[mysqld_safe]
open_files_limit = 8192
log-error=/var/log/mysql/error.log

[mysqldump]
quick
max_allowed_packet = 512M

[myisamchk]
key_buffer = 64M
sort_buffer = 64M
read_buffer = 16M
write_buffer = 16M

And could this have happened because of a crashed table from doing killall -9 mysqld because it would not shut down and restart?

EDIT:

root@server [/var/lib/mysql/user_motiva]# myisamchk -e *.MYI
Checking MyISAM file: wp_options.MYI
Data records:    1827   Deleted blocks:       3
myisamchk: warning: 3 clients are using or haven't closed the table properly
- check file-size
- check record delete-chain
- check key delete-chain
- check index reference
- check data record references index: 1
- check data record references index: 2
- check records and index references
MyISAM-table 'wp_options.MYI' is usable but should be fixed
root@server [/var/lib/mysql/user_motiva]# myisamchk --safe-recover wp_options.MYI
- recovering (with keycache) MyISAM-table 'wp_options.MYI'
Data records: 1827
myisamchk: error: Can't create new tempfile: 'wp_options.TMD'
MyISAM-table 'wp_options.MYI' is not fixed because of errors
Try fixing it by using the --safe-recover (-o), the --force (-f) option or by not using the --quick (-q) flag
root@ns2 [/var/lib/mysql/user_motiva]# myisamchk -o -f wp_options.MYI
- recovering (with keycache) MyISAM-table 'wp_options.MYI'
Data records: 1827

Does this mean that it is now fixed? If so, how do I move it back (this was done on a different server)? Is there a way to bring MySQL down on the main server and run a command to fix all the files?
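If the fix is to be done in place on the main server instead (a sketch): remove the stale wp_options.TMD left behind by the interrupted repair (that leftover file is typically what makes myisamchk fail with "Can't create new tempfile"), then let the server rebuild the table:

REPAIR TABLE user_motiva.wp_options;
-- or, if the index file itself is damaged:
REPAIR TABLE user_motiva.wp_options USE_FRM;
CHECK TABLE user_motiva.wp_options;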

Group By days interval (aging type)

Posted: 05 Apr 2013 10:59 AM PDT

I would like to have MySQL group by a day interval, for example group by every 20 days from the current day, like 1-20, 21-40, 41-60 and so on, up to, say, 120 days. The user can choose the day interval and the maximum number of days.
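One way to express this in MySQL (a sketch; orders/created_at are placeholder names, and 20 and 120 would be the user-supplied parameters):

SELECT
    FLOOR(DATEDIFF(CURDATE(), created_at) / 20) AS bucket,
    CONCAT(FLOOR(DATEDIFF(CURDATE(), created_at) / 20) * 20 + 1, ' - ',
           (FLOOR(DATEDIFF(CURDATE(), created_at) / 20) + 1) * 20) AS age_range,
    COUNT(*) AS row_count
FROM orders
WHERE created_at >= CURDATE() - INTERVAL 120 DAY
GROUP BY bucket, age_range
ORDER BY bucket;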

How to run a cold backup with Linux/tar without shutting down MySQL slave?

Posted: 05 Apr 2013 01:07 PM PDT

I run the following before tar-ing up the data directory:

STOP SLAVE;
FLUSH TABLES WITH READ LOCK;
FLUSH LOGS;

However, tar will sometimes complain that the ibdata* and ib_logfiles* files are updated during the process. What am I missing?

The slave machine is a cold standby, so there are no client processes running while tar is running.

CentOS release 5.6 64bits, MySQL 5.1.49-log source distribution.
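FLUSH TABLES WITH READ LOCK blocks clients, but it does not stop InnoDB's background threads, which keep flushing dirty pages into ibdata* and writing to the ib_logfiles. A common pre-copy step (a sketch) is to push the dirty pages out and wait for the flushing to settle; even then, only a clean mysqld shutdown (or a tool like xtrabackup) guarantees the files are quiescent:

SET GLOBAL innodb_max_dirty_pages_pct = 0;
-- poll until "Modified db pages" in the BUFFER POOL section is near 0
SHOW ENGINE INNODB STATUS\G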

Mysql innoDB write operations are extremely slow

Posted: 05 Apr 2013 08:59 PM PDT

I'm having serious performance problems with MySQL and the InnoDB engine. Even the simplest table makes writing operations (creating the table, inserting, updating and deleting) horribly slow, as you can see in the following snippet.

mysql> CREATE TABLE `test` (`id` int(11) not null auto_increment,
    -> PRIMARY KEY(`id`)) ENGINE=InnoDB;
Query OK, 0 rows affected (4.61 sec)

mysql> insert into test values ();
Query OK, 1 row affected (1.92 sec)

mysql> insert into test values ();
Query OK, 1 row affected (0.88 sec)

mysql> insert into test values ();
Query OK, 1 row affected (1.10 sec)

mysql> insert into test values ();
Query OK, 1 row affected (6.27 sec)

mysql> select * from test;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
+----+
4 rows in set (0.00 sec)

mysql> delete from test where id = 2;
Query OK, 1 row affected (0.28 sec)

mysql> delete from test where id = 3;
Query OK, 1 row affected (6.37 sec)

I have been looking at htop and the long waiting times are not because of abnormal CPU load. It's almost zero, and memory usage is also normal. If I create the same table using the MyISAM engine, then it works normally. My my.cnf file contains this (if I remember right I haven't changed anything from the default Debian configuration):

[client]
port        = 3306
socket      = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket      = /var/run/mysqld/mysqld.sock
nice        = 0

[mysqld]
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket      = /var/run/mysqld/mysqld.sock
port        = 3306
basedir     = /usr
datadir     = /var/lib/mysql
tmpdir      = /tmp
language    = /usr/share/mysql/english
skip-external-locking
bind-address        = 127.0.0.1
key_buffer      = 40M
max_allowed_packet  = 16M
thread_stack        = 128K
thread_cache_size       = 8
myisam-recover         = BACKUP
max_connections        = 100
table_cache            = 64
thread_concurrency     = 10
query_cache_limit   = 1M
query_cache_size        = 40M
log_slow_queries    = /var/log/mysql/mysql-slow.log
long_query_time = 2
log-queries-not-using-indexes
expire_logs_days    = 10
max_binlog_size         = 100M

[mysqldump]
quick
quote-names
max_allowed_packet  = 16M

[isamchk]
key_buffer      = 16M
!includedir /etc/mysql/conf.d/

I have also tried to restart the server, but it doesn't solve anything.

The slow queries log doesn't give any extra information.
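Second-long single-row inserts with near-zero CPU usually point at fsync latency on the commit path rather than at MySQL itself. A quick experiment (a sketch, not a recommendation for production durability settings):

-- 1 = fsync the log on every commit (default); 2 = write on commit and
-- fsync roughly once per second. If this makes the inserts fast, the
-- disk's sync latency is the bottleneck.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;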

PostgreSQL: Unable to run repmgr cloned database

Posted: 05 Apr 2013 04:41 PM PDT

I'm running tests with PostgreSQL hot standby with 1 master, and exactly 1 slave.

I am using the instructions on this guide: http://www.howtoforge.com/how-to-set-up-a-postgresql-9.0-hot-standby-streaming-replication-server-with-repmgr-on-opensuse-11.4

I'm using PostgreSQL version 9.1, repmgr 1.1.0 and Ubuntu 10.04 LTS.

I followed the steps up to step 6 in the guide, where I ran

repmgr -D /var/lib/postgresql/9.1/main -d pgbench -p 5432 -R postgres --verbose standby clone pgmaster  

on pgslave.

Then I did a

/etc/init.d/postgresql start  

on it, and the script (seemingly) finished successfully.
However, executing psql throws an error:

psql: FATAL: the database system is starting up

Any help on proceeding further is welcome.

Which database is best for deeply embedded database or through C DLL?

Posted: 05 Apr 2013 08:48 PM PDT

I want a deeply embedded database. Deeply embedded means that the server is started and stopped by the application, with no TCP/IP and no open port. The main features under consideration are:

  1. At least 10 GB Database
  2. Database encryption and search in encryption
  3. Full text Search
  • Possible updating of the database online [this is only an additional benefit]
  5. Must have C DLL

There are many options available, like MySQL, Oracle, and MS SQL Server. An open-source database would be great.

What can cause statistics to get out of line?

Posted: 05 Apr 2013 03:11 PM PDT

I've just worked through a problem at a client's site which, it turned out, was caused by the statistics being wrong, which caused the optimizer to time out. Running exec sp_updatestats fixed the problem and all is now good.

What I am a bit confused about now is how did the statistics get out of line in the first place?

The database has both auto_create_stats and auto_update_stats switched on. So SQL Server should have kept the statistics up to date without any intervention.

So why did it fail in this instance?

This client had recently upgraded their database server. They handled it themselves, so I'm not exactly sure what procedure they went through, but I can't imagine it was anything more complicated than backing the database up on the old server and restoring it on the new one. Could this have caused the glitch somehow?
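When this comes up again, it can help to look at how stale the statistics actually are (a sketch; dbo.SomeTable is a placeholder, and sys.dm_db_stats_properties needs SQL Server 2008 R2 SP2 / 2012 SP1 or later):

SELECT s.name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.SomeTable');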
