Monday, May 27, 2013

[how to] Eliminating duplicate records in data cleansing

Eliminating duplicate records in data cleansing

Posted: 27 May 2013 07:28 PM PDT

I have a database full of records of people with simple information like first name, last name, email, location, and so on. I need to eliminate the duplicate records. From what I've searched, the process is called "duplicate elimination" in data cleansing. Does anyone know a good open source tool to do that?
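For reference, a minimal SQL sketch of the core operation, assuming a numeric id column and that two rows sharing an email are duplicates (real cleansing tools use fuzzier matching than this):

    -- Keep the lowest id per email address and delete the rest (MySQL syntax)
    DELETE p1
    FROM people AS p1
    JOIN people AS p2
      ON p1.email = p2.email
     AND p1.id > p2.id;

Here people, id and email are hypothetical names standing in for the actual schema.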

Failed to set relationship for queries

Posted: 27 May 2013 09:05 PM PDT

I have two tables:

table 1

    code    description    sku    quantity
    1001    TV             usa    100
    1002    camera         gcc    200
    1003    elec item      usa    300

code is the primary key

table 2

    code    description    sku    Order
    1001    TV             usa    50
    1002    camera         gcc    75
    1005    mech item      gcc    20

code is the primary key, and sku is common so the query result should look like this:

    code    description    sku    quantity    Order
    1001    TV             usa    100         50
    1002    camera         gcc    200         75
    1003    elec item      usa    300         blank
    1005    mech item      gcc    blank       20

Note here that 1005 is only in table 2; it's not in table 1. I have tried many ways to solve the problem, but the 1005 record never comes back from my query.
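What the expected output describes is a full outer join on code. A sketch in MySQL syntax, where FULL OUTER JOIN is not available and has to be emulated with a UNION of two outer joins (table1/table2 stand in for the real table names):

    -- Matched rows plus unmatched rows from both sides; missing values come back NULL
    SELECT t1.code, t1.description, t1.sku, t1.quantity, t2.`Order`
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.code = t2.code
    UNION
    SELECT t2.code, t2.description, t2.sku, t1.quantity, t2.`Order`
    FROM table1 t1
    RIGHT JOIN table2 t2 ON t1.code = t2.code;

On engines that support it (SQL Server, PostgreSQL, Oracle), a single FULL OUTER JOIN ... ON t1.code = t2.code does the same job.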

Can't get rid of filesort on joined query sorting on second table

Posted: 27 May 2013 03:20 PM PDT

For some reason, this query:

    EXPLAIN SELECT * FROM biz AS b
    INNER JOIN listings AS l ON b.id = l.biz_id
    WHERE l.pub_id = 14 AND b.cat_id = 310
    ORDER BY l.level DESC, l.random DESC
    LIMIT 5;

says it would use a filesort:

    +----+-------------+-------+------+---------------+--------+---------+-------+------+---------------------------------+
    | id | select_type | table | type | possible_keys | key    | key_len | ref   | rows | Extra                           |
    +----+-------------+-------+------+---------------+--------+---------+-------+------+---------------------------------+
    |  1 | SIMPLE      | b     | ref  | [...]         | cat_id | 4       | const | 1160 | Using temporary; Using filesort |
    |  1 | SIMPLE      | l     | ref  | [...]         | biz_id | 4       | b.id  |    1 | Using where                     |
    +----+-------------+-------+------+---------------+--------+---------+-------+------+---------------------------------+

I have tried setting up several indexes on listings that include biz_id, pub_id, level and random, but no need to bore you with my failings; what would be the proper index to use to avoid a filesort in this query? Or is it impossible?
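One avenue worth testing (a sketch, not a guarantee): give MySQL an index on listings that matches the constant filter followed by the ORDER BY columns, and make the join start from that table so rows arrive pre-sorted:

    -- Equality column first, then the two sort columns
    ALTER TABLE listings ADD INDEX idx_pub_level_random (pub_id, level, random);

    -- If the optimizer still insists on starting from biz, STRAIGHT_JOIN
    -- forces the listings-first order, trading more scanned rows for no sort
    SELECT STRAIGHT_JOIN b.*, l.*
    FROM listings AS l
    INNER JOIN biz AS b ON b.id = l.biz_id
    WHERE l.pub_id = 14 AND b.cat_id = 310
    ORDER BY l.level DESC, l.random DESC
    LIMIT 5;

Whether this wins depends on how selective cat_id = 310 is compared to pub_id = 14.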

Statistical Analysis of Data that has to be done in an order?

Posted: 27 May 2013 04:23 PM PDT

Bear with me - this is the first time I've tried that in SQL Server; normally I do it on the front end ;)

I am implementing some analysis on time-coded data series. This is not super complicated stuff, but some of it requires numbers we do not store in the database, which have to be calculated by aggregating the stored numbers with a specific algorithm IN ORDER.

To give an example:

  • I have a list of trades and I need to know the maximum loss we had in the account, so I need to aggregate the plus/minus and then take the most extreme negative and positive.

This cannot be pre-calculated due to dynamic filtering - there are a number of filters that can be applied to the data.

So far - in the past - I pulled the data into the application; now, for the standard stuff, I plan to keep that in SQL Server.

My problem now is this - I can see how that works (acceptably) in SQL Server:

    [TradeBegin],
    Pnl,
    SUM(Pnl) OVER (ORDER BY [TradeBegin] ROWS UNBOUNDED PRECEDING) [Total]

But if I put that into a view... and then filter out rows, the SUM is still calculated from the beginning. And I need a view because I want (need) to map that standard analysis data into an ORM (so dynamic SQL is out). Does anyone have an idea how to do that?
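One pattern that applies the filter before the window function is an inline table-valued function, which behaves like a parameterized view (a sketch; dbo.Trades and the parameter names are assumptions, and whether your ORM can map a TVF is a separate question):

    CREATE FUNCTION dbo.fn_RunningPnl (@From datetime, @To datetime)
    RETURNS TABLE
    AS RETURN
    (
        SELECT [TradeBegin],
               Pnl,
               SUM(Pnl) OVER (ORDER BY [TradeBegin]
                              ROWS UNBOUNDED PRECEDING) AS [Total]
        FROM   dbo.Trades
        WHERE  [TradeBegin] >= @From    -- rows are filtered out here,
          AND  [TradeBegin] <  @To      -- before the running SUM is computed
    );

Because the WHERE clause lives inside the function, the running total starts from the first row that survives the filter rather than from the beginning of the table.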

Cannot see Green Button with white arrow in object explorer; DTC config problems?

Posted: 27 May 2013 01:35 PM PDT

I have SQL Servers across locations set up for replication. On one of the SQL Server instances that I installed, I can see the green button with the white arrow in Object Explorer when I connect to it locally. However, when I connect to it from any other location, I cannot see that, nor can I start or stop the SQL Server Agent, even though locally I am able to do so.

Additionally, MSDTC has only 2 items sent, both of which were rejected, whereas other servers have hundreds committed and sent. Is there something wrong with the DTC settings?

Please help.

SQL Server Memory for Backup Buffer

Posted: 27 May 2013 02:05 PM PDT

In SQL Server 2012, does memory for the backup buffer come from the buffer pool or from outside the buffer pool (VAS)? Has this behavior changed between SQL Server 2008 R2 and SQL Server 2012?

EDIT: Does the max server memory configuration setting now control memory for the backup buffer? Historically, this came out of MemToLeave (or Virtual Address Space) outside the SQL Server buffer pool, didn't it?

UTF8 vs ASCII or custom binary format: high-performance optimization on very big tables

Posted: 27 May 2013 12:25 PM PDT

The summary of my question is whether there are advantages, even if minimal, to using ASCII, or even a smaller format created specifically for the situation, instead of UTF-8 for strings.

It is possible to have the web server that accesses the data translate an ASCII string to a more compact format.

In this case, just 1-3 tables in the database handle a massive amount of data, so every byte can make a difference if the data can fit in memory instead of being read from disk. The information will be accessed via a RESTful service.

Reads vs Writes

The project needs many more reads than writes. But writing has a peculiarity: every 10 seconds, somewhere between 40 and 300 new rows are inserted into the main table. These can be written in parallel, since they do not depend on each other.

Memory vs Disk usage

Recently inserted rows, which will be used immediately, are also put into a cache for the web service's use, so there is no need to read them again. But for searches on old records the database will be needed, and it should be fast.

This is why I suppose that using fewer bytes to store some fields will make a difference: even a larger amount of data will fit in memory more easily.

If I cannot fit the data in memory and the database cannot hide the slowdown from me, then either I force slow table/partition scans every 10 seconds for just one user, or I do a single select and cache it on the web server, but that breaks the "stateless" idea of REST.

Characters that must be supported

0-9, A-Z, "-", "_". Maybe I will also need "a-z". Just 38 or 64 characters, and never more than this.

For now, most columns are CHAR(3), CHAR(6), VARCHAR(8) and VARCHAR(10). Examples:

  • ABC, XYZ
  • AB-EFG, XY-XPT
  • ABC123457
  • E47F6C, 34210A, E48D37 (hex strings, maybe exist a format specific to this?)

Technologies used

The database will be MariaDB. Maybe part of the raw data will live in some NoSQL database. The language of the web service does not really make a difference here, but it will be PHP 5.4 with the Phalcon PHP framework.

Different types of cache could be used, from Varnish to APC/Memcached through to caches inside the database.

My initial hypothesis

One field with VARCHAR(10) needs 80 bits. If I use a custom alphabet of just 64 or fewer characters instead of 255, I save around 4 times, so instead of 80 bits it would use just 20 bits.
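On the hex-string examples specifically, MySQL/MariaDB can already store those at half the size by keeping raw bytes in a BINARY column (a sketch; codes, id and colour are hypothetical names):

    -- 'E47F6C' is 6 characters of text but only 3 bytes of information
    CREATE TABLE codes (
        id     INT PRIMARY KEY,
        colour BINARY(3)
    );

    INSERT INTO codes VALUES (1, UNHEX('E47F6C'));

    SELECT id, HEX(colour) FROM codes WHERE colour = UNHEX('E47F6C');

UNHEX()/HEX() do the translation at the database boundary, so the application keeps working with the familiar hex strings.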

Can I limit an index to only contain the newest relevant row?

Posted: 27 May 2013 06:46 PM PDT

I have a table in PostgreSQL 9.2 that records login sessions, called user_login_session. This table contains user_id, start_timestamp and end_timestamp columns, amongst others. The table has started growing as my system gains traction, and I can't purge old records for now.

I've defined an index on this table but I'm concerned about its growth over time. Is it possible to define an index which only keeps a handle to the newest login session row per user ID? Older rows are not accessed operationally.
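PostgreSQL 9.2 has no built-in notion of "index only the newest row per key", but a composite index keeps the latest-row lookup cheap regardless of table growth (a sketch using the column names above; 42 is a placeholder user ID):

    -- The newest session per user is one index descent away
    CREATE INDEX user_login_session_user_start_idx
        ON user_login_session (user_id, start_timestamp DESC);

    SELECT *
    FROM   user_login_session
    WHERE  user_id = 42
    ORDER  BY start_timestamp DESC
    LIMIT  1;

A partial index (CREATE INDEX ... WHERE end_timestamp IS NULL, say) can shrink the index further if "newest" coincides with "still open", but that is an assumption about your data.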

Exadata exachk.sh

Posted: 27 May 2013 11:48 AM PDT

I would like to run exachk.sh on a regular basis. Currently I have to interact with the script and answer all kinds of password questions before it starts to extract Exadata information.

I want to automate this and am looking for ways of doing so. Do I really need to modify the exachk.sh script? The problem with that approach is that the script is updated fairly often on Metalink.

Suggestions?

Thanks,

Bjarte Brandt

Optimum configuration for 4 SSD drives

Posted: 27 May 2013 12:40 PM PDT

I would like to move a SQL Server database from a shared configuration with the web server to its own dedicated box. My current budget will allow me to put 4 disks together in an array with a single hot spare. I'd like to stretch to 8+ drives, but the costs are a little out of my budget right now (and probably somewhat overkill).

So my question is, what would be the optimum configuration for SQL Server 2012 when limited to 4 disks? The database is around 29 GB and is growing around 250-500 MB per month. The database will typically deliver 80% reads to 20% inserts/updates/deletes.

I understand from researching this subject that my options are as follows:

  1. RAID 5 with all functions on a single volume.
  2. Dual RAID 1 with OS, Backups and system dbs on one volume and database + log files on the other.
  3. RAID 10 with all functions on a single volume.

I'm looking for a solution that will give me reasonable performance but won't obliterate the array if a single drive fails (which I understand is quite common with SSDs).

Current Hardware

HP ProLiant DL360 G7 1 x Xeon E5640 / 2.66 GHz - RAM 12 GB - 2 x 300GB Pluggable SAS SFF 10,000 rpm disks in RAID 1.

Migrating oracle to sybase using insert into fails due to missing functions

Posted: 27 May 2013 12:26 PM PDT

I generated the DDL from Oracle to Sybase and created the tables with no issues; I even migrated most tables into Sybase using INSERT INTO. However, when it came to the time functions, I get errors:

Function 'to_date' not found. If this is a SQLJ function or SQL function, use sp_help to check whether the object exists (sp_help may produce a large amount of output).  

and

Function 'to_timestamp' not found. If this is a SQLJ function or SQL function, use sp_help to check whether the object exists (sp_help may produce a large amount of output).  
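Sybase ASE has no TO_DATE or TO_TIMESTAMP; the usual stand-in is CONVERT with a style code (a sketch; verify the style numbers against your ASE version's documentation):

    -- Oracle:  TO_DATE('27/05/2013', 'DD/MM/YYYY')
    -- ASE:     style 103 corresponds to dd/mm/yyyy
    SELECT CONVERT(DATETIME, '27/05/2013', 103)

Each TO_DATE/TO_TIMESTAMP call in the migrated INSERT statements needs to be rewritten this way, with the style code chosen to match the literal's format.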

How to restore a SQL Server database with mirroring

Posted: 27 May 2013 04:28 PM PDT

I have SQL Server with mirroring. I am trying to restore the database using this command:

    USE [master]
    ALTER DATABASE [CentralStorage] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    RESTORE DATABASE [CentralStorage]
        FROM DISK = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\XXX.bak'
        WITH FILE = 1, NOUNLOAD, REPLACE, STATS = 5
    ALTER DATABASE [XXX] SET MULTI_USER

I tried the following: I set "remove tail log", set "close connections", and chose to override the existing database.

I get the error:

    Msg 1468, Level 16, State 1, Line 2
    The operation cannot be performed on database "XXX" because it is involved in a database mirroring session or an availability group. Some operations are not allowed on a database that is participating in a database mirroring session or in an availability group.
    Msg 5069, Level 16, State 1, Line 2
    ALTER DATABASE statement failed.
    Msg 3104, Level 16, State 1, Line 3
    RESTORE cannot operate on database 'XXX' because it is configured for database mirroring or has joined an availability group. If you intend to restore the database, use ALTER DATABASE to remove mirroring or to remove the database from its availability group.
    Msg 3013, Level 16, State 1, Line 3
    RESTORE DATABASE is terminating abnormally.

Additional info: Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64)
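The error text itself points at the way out: break the mirroring session first, then restore (a sketch reusing the statement from the question; mirroring has to be set up again from scratch afterwards):

    -- Remove the database from the mirroring session
    ALTER DATABASE [CentralStorage] SET PARTNER OFF;

    RESTORE DATABASE [CentralStorage]
        FROM DISK = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\XXX.bak'
        WITH FILE = 1, NOUNLOAD, REPLACE, STATS = 5;

If the database were in an availability group instead, the equivalent step would be removing it from the group before restoring.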

Is there any way to Restore a Database from newer version to an Older Version of SQL Server? [duplicate]

Posted: 27 May 2013 11:21 AM PDT

I have a database backup file from SQL Server 2012 and I want to restore it on SQL Server 2008 R2. If we can't restore a SQL Server 2012 database to an earlier version, what is the next best option? Is there any other way to restore it?

MySQL error 2006 MySQL server has gone away after upgrade to 5.6.10 using homebrew on Mac OS X 10.8.3

Posted: 27 May 2013 12:43 PM PDT

I upgraded my installation of MySQL on Mac OS X 10.8.3 using homebrew:

brew install mysql

Everything seemed to go smoothly, but...not really, as it turns out.

I cannot add a new user (neither through the command line nor through phpMyAdmin). I always get the error:

Error in processing request: #2006 MySQL server has gone away

I can create new databases without a problem, and I can install, for example Joomla 3.1.1 with no problems.

However, when I try to install a new extension into my Joomla installation, I get errors. For example, I get the following error when I try to install a component from NoNumber:

1062 Duplicate entry '1' for key 'PRIMARY' SQL=INSERT INTO `u5z87_extensions` (`name`,`type`,`element`,`folder`,`client_id`,`enabled`,`access`) VALUES ('plg_system_nnframework','plugin','nnframework','system','0','1','1')  

Same thing for many other extensions.

I tried to start mysql with

mysql.server start --log-warnings=2

and it started with no problem

Starting MySQL   SUCCESS!  

I looked up issues where error #2006 was being reported and how to solve them; they referred to modifying /etc/my.cnf, but there is no /etc/my.cnf configuration file on my system. There wasn't one in the previous installation of MySQL 5.5.29 either (also installed with homebrew), and I had no problems then.

I ran brew doctor and the output is below

    $ brew doctor
    Warning: "config" scripts exist outside your system or Homebrew directories.
    `./configure` scripts often look for *-config scripts to determine if
    software packages are installed, and what additional flags to use when
    compiling and linking.

    Having additional scripts in your path can confuse software installed via
    Homebrew if the config script overrides a system or Homebrew provided
    script of the same name. We found the following "config" scripts:

        /usr/local/opt/php54/bin/php-config

The output from brew --config is also below:

    $ brew --config
    HOMEBREW_VERSION: 0.9.4
    ORIGIN: https://github.com/mxcl/homebrew
    HEAD: 86ce842c700a47abdc354e80cc3b0b3938ab2132
    HOMEBREW_PREFIX: /usr/local
    HOMEBREW_CELLAR: /usr/local/Cellar
    CPU: 8-core 64-bit nehalem
    OS X: 10.8.3-x86_64
    Xcode: 4.6.2
    CLT: 4.6.0.0.1.1365549073
    LLVM-GCC: build 2336
    Clang: 4.2 build 425
    X11: 2.7.4 => /opt/X11
    System Ruby: 1.8.7-358
    Perl: /usr/bin/perl
    Python: /usr/bin/python
    Ruby: /Users/alisamii/.rvm/rubies/ruby-1.9.3-p374/bin/ruby

Exploring a little further, I tried to revert to my previous install of MySQL using

brew switch mysql 5.5.29

This resulted in the following output:

    $ brew switch mysql 5.5.29
    Cleaning /usr/local/Cellar/mysql/5.5.29
    Cleaning /usr/local/Cellar/mysql/5.6.10
    135 links created for /usr/local/Cellar/mysql/5.5.29

I then accessed mysql from the command line and ran status:

    mysql> status
    --------------
    mysql  Ver 14.14 Distrib 5.5.29, for osx10.8 (i386) using readline 5.1

    Connection id:          13
    Current database:
    Current user:           alisamii@localhost
    SSL:                    Not in use
    Current pager:          stdout
    Using outfile:          ''
    Using delimiter:        ;
    Server version:         5.6.10 Source distribution
    Protocol version:       10
    Connection:             Localhost via UNIX socket
    Server characterset:    utf8
    Db     characterset:    utf8
    Client characterset:    utf8
    Conn.  characterset:    utf8
    UNIX socket:            /tmp/mysql.sock
    Uptime:                 1 hour 56 min 24 sec

As you can see above, it reports mysql Ver 14.14 Distrib 5.5.29, for osx10.8 (i386) using readline 5.1

But, it also says Server version: 5.6.10 Source distribution

So something isn't syncing up.

Furthermore, when I show databases, I get the following output:

    mysql> show databases
        -> ;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    +--------------------+
    1 row in set (0.00 sec)

But I know I have more than just an information_schema database: in phpMyAdmin, I have 14 databases displaying.

So, somehow, it seems to me that mysql is not connecting to the right server, as phpMyAdmin reports the same server information that mysql's status command reports: Software version: 5.6.10 - Source distribution.

At this point I am completely lost and would really appreciate some help.

Mysql query grouping by two columns for max date takes too long

Posted: 27 May 2013 06:13 PM PDT

I have table that looks like this:

    l_p    l_a    l_timestamp          l_n    l_act    l_name
    123    321    2011-1-1T01:00:00Z   B      q        das
    21     23     2012-1-1T01:00:00Z   C      q        sd
    123    23     2013-1-1T01:00:00Z   D      q        sad
    21     321    2008-1-1T01:00:00Z   D      q        sad
    21     23     2007-1-1T01:00:00Z   E      q        sad

The timestamp field is binary. All other fields are varbinary. I am not interested in the l_name field, but it is relevant for the indexes.

What I want to get back is, for each unique combination of l_p and l_a, the row with the max timestamp. I also have some additional conditions: the timestamp should be smaller than 2013-1-1 and the l_act field should be 'q'. The result I want would look something like this:

    l_p    l_a    l_timestamp          l_n    l_act
    123    321    2011-1-1T01:00:00Z   B      q
    21     23     2012-1-1T01:00:00Z   C      q
    21     321    2008-1-1T01:00:00Z   D      q

The table contains about 50 million records. My current query is extremely slow: it runs for hours and I still haven't gotten a result. This is the query:

    SELECT a.l_p, a.l_a, a.l_timestamp,
           CASE a.l_new
               WHEN 'B' THEN 2
               WHEN 'C' THEN 3
               ...
           END AS GN
    FROM tmp a
    INNER JOIN
        (SELECT l_a, l_p, MAX(l_timestamp) AS max_date
         FROM tmp
         WHERE l_act = 'q' AND (l_n != 'I' AND l_n != 'Z')
           AND l_timestamp < CAST('2013-01-01T00:00:00Z' AS BINARY)
         GROUP BY l_p, l_a) b
        ON (a.l_p = b.l_p AND a.l_a = b.l_a AND
            a.l_timestamp = b.max_date AND
            a.l_act = 'q' AND
            (l_n != 'I' AND l_n != 'Z'))

I've also tried this variation with the same results:

    SELECT a.l_p, a.l_a, a.l_timestamp,
           CASE a.l_new
               WHEN 'B' THEN 2
               WHEN 'C' THEN 3
               ...
           END AS GN
    FROM tmp a,
        (SELECT l_a, l_p, MAX(l_timestamp) AS max_date
         FROM tmp
         WHERE l_act = 'q' AND (l_n != 'I' AND l_n != 'Z')
           AND l_timestamp < CAST('2013-01-01T00:00:00Z' AS BINARY)
         GROUP BY l_p, l_a) b
    WHERE (a.l_p = b.l_p AND a.l_a = b.l_a AND
           a.l_timestamp = b.max_date AND
           a.l_act = 'q' AND
           (l_n != 'I' AND l_n != 'Z'))

The indexes on the table are:

    Keyname      Type     Unique    Packed
    PRIMARY      BTREE    Yes       No

    Field        Cardinality    Collation
    l_project    18             A
    l_name       18             A
    l_a          24799255       A
    l_act        49598510       A
    l_timestamp  49598510       A

    Keyname      Type     Unique    Packed
    l_a          BTREE    No        No

    Field        Cardinality    Collation
    l_a          16532836       A
    l_name       16532836       A

Any ideas on why it takes so long? The table is InnoDB.
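One thing worth testing (a sketch; it assumes the varbinary columns are short enough for a full-width index): a composite index that starts with the equality filter and ends with the timestamp lets the GROUP BY ... MAX() subquery be resolved from the index alone, in order:

    ALTER TABLE tmp ADD INDEX idx_act_p_a_ts (l_act, l_p, l_a, l_timestamp);

Without something like this, the subquery has to sort 50 million rows to group them before the join even starts.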

Check if bit is set in mysql

Posted: 27 May 2013 06:09 PM PDT

I have a field which stores the days of the week in binary representation. For example, Mon-Wed-Fri would be represented as b'0101010' with the 1st digit as Sunday and the last as Saturday.

What I want to achieve: if today is Monday, I want to check whether the Monday bit is set to 1. I do that with (b'0101010' & b'0100000' > 0). But I am struggling to convert the current date into the corresponding weekday binary representation. How can I achieve this?
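A sketch of the conversion: DAYOFWEEK() returns 1 for Sunday through 7 for Saturday, which lines up with the leftmost-digit-is-Sunday layout after a shift (days and schedule are hypothetical names for your column and table):

    -- Monday: DAYOFWEEK() = 2, so 1 << (7 - 2) = 32 = b'0100000'
    SELECT days & (1 << (7 - DAYOFWEEK(CURDATE()))) > 0 AS is_today_set
    FROM   schedule;

The shift builds the single-bit mask for today, and the bitwise AND tests it exactly as in your Monday example.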

Clear schema from database without dropping it

Posted: 27 May 2013 03:58 PM PDT

I'm working on a school project where I have a SQL Server with a database for my team.

I already imported a local database created with Entity Framework.

Now the model has changed, table properties were added/deleted and I want to update my full database.

However, the teachers didn't give us create rights, so dropping the whole database isn't really an option.

Now my question is: is it possible to drop all the tables currently in the database and just import the newly created ones without problems? Or do I really need to drop the whole database?
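Dropping just the objects does not require CREATE DATABASE rights. One common trick is to generate the DROP statements from the catalog and run the output (a sketch; foreign keys may force you to drop constraints, or order the drops, first):

    -- Produces one DROP TABLE statement per user table
    SELECT 'DROP TABLE [' + s.name + '].[' + t.name + '];'
    FROM   sys.tables t
    JOIN   sys.schemas s ON s.schema_id = t.schema_id;

After the tables are gone, letting Entity Framework re-create the schema in the existing (empty) database should behave like a fresh import.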

Are there any disadvantages to partitioning on financial year?

Posted: 27 May 2013 10:55 AM PDT

Our current set up has one table per financial year (May 1- April 30). Each table has approx 1.5 million rows. We have about 8 years of data, and will obviously be adding each year.

The majority of queries are within the financial year/one partition: either select * from sales where datetime between '2013-05-01' and '2014-04-30 23:59:59' or some period within that year.

My plan is to have a range partition on an InnoDB table. e.g.

    PARTITION BY RANGE COLUMNS(datetime)
    (PARTITION p0 VALUES LESS THAN ('1999-05-01') ENGINE = InnoDB,
     PARTITION p1 VALUES LESS THAN ('2000-05-01') ENGINE = InnoDB,
     PARTITION p2 VALUES LESS THAN ('2001-05-01') ENGINE = InnoDB,
     ...

This means that the PK has to become PRIMARY KEY (index,datetime).

Are there any significant disadvantages to partitioning compared to having an unpartitioned table? I know it means the PK is now length 12 and all further indexes will have that prepended to them. Does that make a difference? The table needs to work faster on reads than writes, and there are a fair few indexes on it.

  • Is there any functional difference between partitioning on RANGE COLUMNS(datetime) and RANGE with TO_DAYS(datetime) using MySQL 5.5+? (The TO_DAYS form is sketched after this list.)
  • Is partitioning on datetime OK? One alternative is adding a column for financial year and indexing/partitioning on that (and then adding another index on datetime for other queries).

We do sometimes need to query across all time or over "the last X months", but this is pretty rare. The main advantages of moving to a single table are eliminating the logic in the application that works out which table to insert/update/select, and not needing to calculate unions in those situations where we need more than one table.
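For comparison, this is what the TO_DAYS variant of the same scheme looks like (a sketch; both forms prune partitions on datetime range predicates, while RANGE COLUMNS compares the column directly and TO_DAYS maps it to an integer first):

    PARTITION BY RANGE (TO_DAYS(datetime))
    (PARTITION p0 VALUES LESS THAN (TO_DAYS('1999-05-01')) ENGINE = InnoDB,
     PARTITION p1 VALUES LESS THAN (TO_DAYS('2000-05-01')) ENGINE = InnoDB,
     PARTITION p2 VALUES LESS THAN (TO_DAYS('2001-05-01')) ENGINE = InnoDB,
     ...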

SUPER privilege not defined for master user in Amazon MySQL RDS

Posted: 27 May 2013 01:12 PM PDT

I have created one medium instance on Amazon RDS in the Asia Pacific (Singapore) region. I have created my master user with a master password, and it works/connects fine from Workbench installed on my local PC. But when I go to create a function on that instance, it shows me the following error:

ERROR 1418: This function has none of DETERMINISTIC, NO SQL, or READS SQL DATA in its declaration and binary logging is enabled (you might want to use the less safe log_bin_trust_function_creators variable)

On my instance, the variable log_bin_trust_function_creators shows OFF. Now when I try to change that variable using

SET GLOBAL log_bin_trust_function_creators = 1;  

it gives me another error:

Error Code: 1227. Access denied; you need (at least one of) the SUPER privilege(s) for this operation

I don't know how to solve this error.

Can anybody help?
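Since RDS withholds SUPER, one workaround is to satisfy error 1418 directly by declaring one of the characteristics it lists (a sketch; hello_name is a hypothetical function, and the characteristic must truthfully describe what the function does):

    DELIMITER //

    CREATE FUNCTION hello_name(name VARCHAR(64)) RETURNS VARCHAR(80)
        DETERMINISTIC    -- or NO SQL / READS SQL DATA, as appropriate
    BEGIN
        RETURN CONCAT('Hello, ', name);
    END//

    DELIMITER ;

Alternatively, log_bin_trust_function_creators can be changed through an RDS DB parameter group rather than SET GLOBAL, which is the knob RDS provides in place of SUPER.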

How to import table's data in MySql from Sql Server?

Posted: 27 May 2013 02:12 PM PDT

I am trying to export a table from SQL Server 2008 R2 to MySQL 5.5. For this I am using the SQL Server Import and Export Wizard of SQL Server 2008, but it is giving an error (attached as a screenshot in the original question).

This error may be occurring because the table in SQL Server has a column with data type image, while in the MySQL table that column's data type is LONGBLOB.

Please provide your expert answers. If it's not possible through the SQL Server Import and Export Wizard, then please suggest any other method for the data transfer.

Is it possible to pipe the result of a mysqldump straight to rsync as the source argument?

Posted: 27 May 2013 03:12 PM PDT

Is it possible to pipe the result of a mysqldump straight to rsync as the source argument?

Conceptually, I was thinking something like:

mysqldump -u root -p database_to_backup > db_backup_file.sql | sudo rsync -avz db_backup_file.sql myuser@mysite.com:/var/www/db_backup_file.sql  

I've seen people pipe the result to mysql for their one-liner backup solution, but I was curious whether it was possible with rsync. You know - because rsync is magic :)

Thanks for your time!

Generate XML using Oracle XML schema

Posted: 27 May 2013 03:53 PM PDT

I have some XML schema definitions that have been registered using DBMS_XMLSCHEMA. I have created tables using the XMLTypes generated. I can create XML using XMLQuery and if I understand correctly, the query results can be inserted into the table if the definition is correct.

Ultimately the XML needs to be written to a file, but the part I am unsure about is whether writing XMLQueries from scratch is the best way to generate the XML. Is there any way to have the XML schema generate an XML stub as a guide, or generate a template XMLQuery? The data is currently all in regular Oracle tables, but arranged quite differently, so any information on easing the export to XML based on a specific set of registered schemas would be useful.

Import batches of rows from Excel in parallel

Posted: 27 May 2013 03:53 PM PDT

We receive a weekly Excel file currently containing about 250k rows. The file grows by about 300 rows each week.

In SSIS I've hooked up an Excel source to an OLE DB destination to import the file, but it takes a long time.

I've looked at using the SQL Command to specify how to select part of the sheet, but I can't see how to turn this into a parallel operation.

Is it possible in SSIS to split the Excel file (by rows) and import it in parallel?
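In principle, yes: the Excel driver accepts worksheet ranges in the source's SQL command, so two Excel sources in the same data flow can each read half the sheet into one destination (a sketch assuming a sheet named Sheet1 with columns A through J; note the second range has no header row, so its connection needs HDR=NO or an extra skipped row):

    -- Excel source 1: header plus the first half of the rows
    SELECT * FROM [Sheet1$A1:J125000]

    -- Excel source 2: the remaining rows
    SELECT * FROM [Sheet1$A125001:J250000]

Whether this actually speeds things up depends on whether the Excel driver or the OLE DB destination is the bottleneck, so it is worth measuring before committing to the extra complexity.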

Connect Error (2054) - mysql_old_password issue still not solved

Posted: 27 May 2013 11:48 AM PDT

I use PHP 5.4.3 and I want to connect to a MySQL DB remotely (the PHP script resides on a different server than the DB) in the normal OOP way:

$mysqli = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME);  

and I get this error:

    Warning: mysqli::mysqli(): The server requested authentication method unknown to the client [mysql_old_password]
    Connect Error (2054) The server requested authentication method unknown to the client

I found this and this, but they don't seem to help. I used PHP 5.2.9 and everything worked OK, but downgrading PHP is not a solution for me. I also have to mention that I don't have full access to the DB's settings, because the client doesn't give me that full access. I told him to ask HostGator for newer password format support, but they answered that they cannot do anything, which was not what I was expecting. I made a new attempt from another remote server and now I get this:

Connect Error (1045) Access denied for user 'user'@'host' (using password: YES)  

That attempt used PHP 5.3.9. It looks like the DB doesn't let me in from the remote server's PHP script, but with Oracle's Workbench everything works fine. Thanks for your time.

MySQL Workbench sync keeps requesting the same changes

Posted: 27 May 2013 06:12 PM PDT

I am using MySQL Workbench, and when I try to "synchronize" it with my remote database, it keeps detecting some changes to make.

Specifically, the most recurrent ones are:

  • Dropping foreign keys and re-creating the same ones again
  • Changing the collation

I was compliant and executed all the queries given to me (and added the semi-colon that they forgot). MySQL didn't complain and executed them.

However, it didn't help; I can run it 20 times in a row and it will still request the same useless changes.

Deleting Data From Multiple Tables

Posted: 27 May 2013 11:12 AM PDT

Suppose I have a table called UNIVERSITY containing university names:

    universityID    universityNAME    isACTIVE
    7               GNDU              1
    6               PU                1
    5               PTU               1
    8               LPU               1

Now these university IDs have (obviously) been used in many tables within the database (named, e.g., Education) - suppose 10 tables.

Q. Now what happens if I delete one university?

A. The universityID field in the other tables becomes NULL.

But I don't want that; rather, when I delete one university from the UNIVERSITY table, all the rows referencing it in all 10 tables should get deleted.

What would be the shortest and easiest MySQL query for this operation?

NOTE: I'm using PHP.
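The usual answer is not a single query at all but cascading foreign keys, so MySQL does the multi-table cleanup by itself (a sketch against one hypothetical child table named STUDENT; the tables must be InnoDB, and each of the 10 referencing tables needs the same treatment):

    ALTER TABLE STUDENT
        ADD CONSTRAINT fk_student_university
        FOREIGN KEY (universityID)
        REFERENCES UNIVERSITY (universityID)
        ON DELETE CASCADE;

    -- One delete now removes the university and every row that references it
    DELETE FROM UNIVERSITY WHERE universityID = 6;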

Minimizing Indexed Reads with Complex Criteria

Posted: 27 May 2013 12:12 PM PDT

I'm optimizing a Firebird 2.5 database of work tickets. They're stored in a table declared as such:

    CREATE TABLE TICKETS (
        TICKET_ID id PRIMARY KEY,
        JOB_ID id,
        ACTION_ID id,
        STATUS str256 DEFAULT 'Pending'
    );

I generally want to find the first ticket that hasn't been processed and is in Pending status.

My processing loop would be:

  1. Retrieve 1st Ticket where Pending
  2. Do work with Ticket.
  3. Update Ticket Status => Complete
  4. Repeat.

Nothing too fancy. If I watch the database while this loop runs, I see the number of indexed reads climb with each iteration. The performance doesn't seem to degrade terribly as far as I can tell, but the machine I'm testing on is pretty quick. However, I've received reports of performance degradation over time from some of my users.

I've got an index on Status, but it still seems like it scans down the Ticket_Id column each iteration. It feels like I'm overlooking something, but I'm not sure what. Is the climbing number of indexed reads expected for something like this, or is the index misbehaving in some way?

-- Edits for comments --

In Firebird you limit row retrieval like:

    Select First 1
        Job_ID, Ticket_Id
    From
        Tickets
    Where
        Status = 'Pending'

So when I say "first", I'm just asking for a limited record set where Status = 'Pending'.
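One thing to test (a sketch): a compound index plus an explicit ORDER BY makes "first pending" deterministic and lets the engine walk a single index rather than mixing the Status index with primary-key order:

    CREATE INDEX IDX_TICKETS_STATUS_ID ON TICKETS (STATUS, TICKET_ID);

    SELECT FIRST 1
        Job_ID, Ticket_Id
    FROM Tickets
    WHERE Status = 'Pending'
    ORDER BY Ticket_Id;

Note also that in Firebird, rows updated from Pending to Complete leave back-versions behind until garbage collection catches up, which by itself can make indexed reads climb during a tight loop like this.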

How to remove a database server from Utility Explorer

Posted: 27 May 2013 12:23 PM PDT

I use the Utility Explorer to monitor some of my databases, but I forgot to remove one before retiring the server. Normally you could just right-click and select "Remove Managed Instance", but I can't do that since the server has been wiped from existence. Now it is stuck in my list without any data.

What Problems Does an Eager Spool Indicate

Posted: 27 May 2013 10:47 AM PDT

Currently running on SQL Server 2008 R2

I am attempting to increase the performance of an UPDATE statement. I notice an Eager Spool operation popping up in the showplan. My understanding of spool operations is pretty basic: they create temporary storage for the table during the update.

I also know that, while they are preventing much worse execution times, eager spools are often indicative of underlying problems with table structure and/or query statements.

My question is pretty simple: When you see an Eager Spool in your query plan, what problems do you first look to address?

I will be analyzing every part of our system to increase performance - I'm just looking for guidance as to where I should start.
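For one concrete pattern to check first (a sketch with hypothetical names): an eager spool often appears as Halloween protection, when the UPDATE modifies the very column the engine is using an index on to locate the rows:

    -- With an index on Salary, this shape classically forces an eager spool:
    -- the spool snapshots the qualifying rows so the raises cannot make a
    -- row qualify (and be updated) a second time
    UPDATE Employees
    SET    Salary = Salary * 1.10
    WHERE  Salary < 50000;

If your UPDATE fits that shape, driving the seek from a different index, or separating the read from the write via a temp table, is a common first experiment.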
