Monday, July 15, 2013

[how to] How to choose a collation for an international database?

How to choose a collation for an international database?

Posted: 15 Jul 2013 07:38 PM PDT

I'm designing a database which will store data in different languages (using UTF-8), so I think the best way to display query results is to order them according to the user's language at query time (because there is more than one correct way to do it), as follows:

SELECT a < b COLLATE "de_DE" FROM test1;  

Assuming this is the correct way to work with international data, what is the best collation for the database itself? The PostgreSQL documentation says:

The C and POSIX collations both specify "traditional C" behavior, in which only the ASCII letters "A" through "Z" are treated as letters, and sorting is done strictly by character code byte values.

I think this is the best choice in this case, or am I wrong?

(Bonus question: is specifying the collation in the query itself too slow?)
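For illustration, a minimal sketch of per-query collation ordering (assuming a PostgreSQL build where the "de_DE" collation is installed; the table and column names are made up):

SELECT name
FROM customers
ORDER BY name COLLATE "de_DE";   -- sort with German rules regardless of the database default

A C/POSIX database collation combined with explicit COLLATE clauses like this is one common pattern; note that an index built with a different collation will generally not help such a sort.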

How to set up users in SQL 2008 Express over a workgroup? (no domain)

Posted: 15 Jul 2013 04:28 PM PDT

The situation is this: simple network, one server that is not a domain controller, running SQL 2008 Express (64-bit), with 8 workstations running Windows 7 Pro connected to it. Last week, after Windows updates finished on the workstations and the server, one workstation was denied log-on access to the SQL Server. It can see the server and the files stored on it, and can access those files, but it cannot log on to the SQL Server and its DB for our CRM software.

I set this network up over a year ago, upgraded from a Windows Server 2003 box running SQL 2005 Express on a domain (AD on a separate box) to the current stand-alone Server 2008 R2 (no domain), moved the DB over, and figured out how to allow access by creating user accounts with passwords on the server. Using the same user names and passwords when connecting to the server, everything was fine.

So why, after Windows Update did its thing, would one workstation be unable to log in again? (Yes, I have uninstalled those updates and tried reconnecting; no luck.)

Given that I have done this before, what am I missing this time around?

I read the 'suggested questions that may have your answer' while writing this, and all of the answers cover things I have already done or considered, with no positive result.

I have created a new user name with a password in SQL, created a matching Windows account for them on the server with the same password, reset the workstation's machine and user name to match, and went into Control Panel\Credentials and reset those, forcing a fresh log-in to the server to pass those credentials. That all verifies: when I log onto the server in Windows Explorer, it asks for ServerName\UserName + password, I enter that, and it accepts it. Then, when trying to connect to the SQL Server, it fails with the message 'make sure you have the proper credentials etc...'

I am certain I am overlooking one 'little tiny devil of a detail', so I am hoping someone here can assist in uncovering it.

When using SQL Server Management Studio, I see ServerName\InstanceName, and Authentication is set to Windows Authentication when I log into it.

The server itself has mixed-mode SQL\Windows Authentication enabled. Named pipes are set up, as well as TCP/IP. (All other workstations can log in and connect.)

When I create the new user name, it is passed through to the instance DB and has the same parameters as all the other accounts, but it is still unable to log onto the server.

I feel as though I am missing a fundamental 'how to set up user access to SQL using mixed mode authentication' step.
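For reference, a minimal sketch of the server-side steps for a SQL-authenticated (mixed mode) login; the database, login, and password names here are illustrative assumptions, not the actual CRM configuration:

-- at the server level
CREATE LOGIN crm_user WITH PASSWORD = 'StrongP@ssw0rd!';

-- in the CRM database
USE CRMDatabase;
CREATE USER crm_user FOR LOGIN crm_user;
EXEC sp_addrolemember 'db_datareader', 'crm_user';
EXEC sp_addrolemember 'db_datawriter', 'crm_user';

With that in place, the workstation would connect using SQL Server Authentication (rather than Windows Authentication) and supply crm_user plus the password.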

I appreciate everyone's time in reading this as well as any and all assistance given.

Khan

Updateable Subscriptions: Rows do not match between Publisher and Subscriber

Posted: 15 Jul 2013 01:52 PM PDT

I have transactional replication with updatable subscribers set up in SQL Server 2008, with 3 subscribers and 1 publisher. I had to set up replication again due to some errors related to the database and the application which uses it. However, now I run into issues when I try updating a section in my application. It does not go through with the update and gives the following error:

Execption Message: Updateable Subscriptions: Rows do not match between Publisher and Subscriber. Run the Distribution Agent to refresh rows at the Subscriber. Updateable Subscriptions: Rolling back transaction. The transaction ended in the trigger. The batch has been aborted. Exception Triggered by Method: OnError

The update statement obviously doesn't go through. However, when I try it a second time it works. Replication is working; everything seems to be replicating.

Can anyone explain why this error would occur and how I can resolve it? I would really appreciate the help!

Help in designing a reporting database

Posted: 15 Jul 2013 01:02 PM PDT

We are planning to move our detailed report functionality (generating multi-column PDF/XLS output with GROUP BY, ORDER BY, etc.) from a transactional database to an offline reporting database. These reports can be ad hoc, too.

In light of this, we are considering transferring data from the transactional DB into multiple flat, denormalized tables to avoid joins and improve query time. But we need an expert's opinion on this: is this a good approach?

There has been a debate between having one big flat table vs. multiple flat tables. I feel the multiple-flat-tables option is better. Any suggestions?

If this works for detailed reports, how do we achieve summary data reports? Data cubes? I do not know much about cubes.

Error using sp_trace_create with parameters

Posted: 15 Jul 2013 02:34 PM PDT

I'm running the following SQL on SQL Server 2000:

exec sp_trace_create
    @options   = 2,
    @tracefile = N'h:\\trace.trc',
    @filecount = 2

and am getting the following error and can't understand why:

Procedure expects parameter '@tracefile' of type 'nvarchar(128)'

If I pass all of the parameters in like so:

declare @id int

exec sp_trace_create @id output,
    @options     = 2,
    @tracefile   = N'h:\\trace.trc',
    @filecount   = 2,
    @maxfilesize = 5,
    @stoptime    = '2013-8-15'

I get the error:

Procedure or function sp_trace_create has too many arguments specified.
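For what it's worth, a hedged sketch of a call that passes the trace id as a named output variable, declares @maxfilesize with the exact bigint type the procedure expects, and omits @filecount (which, as far as I know, is not part of the SQL Server 2000 signature of sp_trace_create):

declare @id int
declare @maxfilesize bigint
set @maxfilesize = 5

exec sp_trace_create
    @traceid     = @id output,
    @options     = 2,
    @tracefile   = N'h:\trace.trc',   -- single backslash intended; adjust to your path
    @maxfilesize = @maxfilesize

select @id as traceid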

Replicated Database Log File Maintenance

Posted: 15 Jul 2013 02:08 PM PDT

I have a database on the publisher that is involved in replication (the publication is configured for merge and transactional). I am trying to regain control of the log file for this particular database (VLF count, size, etc.).

Is there anything I need to do (or be cautious of) with the replication setup before performing any maintenance on the log file? I am not an expert in the area of replication and cannot find anything solid that provides guidance on what measures should be taken.

Edit: This would include working on the distribution database as well; data retention was not configured at all, for some reason.
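As a point of reference, the log-file work itself is usually along these lines (a hedged sketch with illustrative file names and sizes; with replication in place, make sure the Log Reader Agent is caught up and the log has been backed up before shrinking):

-- inspect the VLF count for the current database
DBCC LOGINFO;

-- shrink the log, then regrow it in one step so it ends up with fewer, larger VLFs
DBCC SHRINKFILE (MyDb_log, 1024);
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_log, SIZE = 8GB);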

Help creating this query

Posted: 15 Jul 2013 03:36 PM PDT

I'd like to integrate the following query:

SELECT parent.id, COUNT(child.id) AS child_count
FROM messages parent
INNER JOIN messages child ON child.parent_id = parent.id
WHERE parent.parent_id = 0
GROUP BY parent.id;

INTO this query:

SELECT m.id, m.seen, m.uid, m.hash, m.date_created, m.subject, m.textplain, m.texthtml,
       CONCAT_WS('@', a.localpart, a.domain) AS address
FROM messages m
LEFT JOIN message_address_fields maf ON maf.message_id = m.id
LEFT JOIN addresses a ON a.id = maf.address_id
WHERE maf.field_id = 4
  AND m.user_id = 1

Here's a sql fiddle: http://sqlfiddle.com/#!2/bef27/1

Here's my schema:

CREATE TABLE IF NOT EXISTS `addresses` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `user_id` int(10) NOT NULL,
  `name` text NOT NULL,
  `localpart` text NOT NULL,
  `domain` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE IF NOT EXISTS `fields` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `name` text,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=12;

CREATE TABLE IF NOT EXISTS `messages` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `user_id` int(10) NOT NULL,
  `account_folder_id` int(10) NOT NULL,
  `hash` varchar(255) NOT NULL,
  `subject` varchar(255) NOT NULL,
  `texthtml` text NOT NULL,
  `textplain` text NOT NULL,
  `uid` int(10) NOT NULL,
  `seen` tinyint(1) NOT NULL,
  `flagged` tinyint(1) NOT NULL,
  `date_created` int(11) NOT NULL DEFAULT '0',
  `last_modified` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `hash` (`hash`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE IF NOT EXISTS `message_address_fields` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `message_id` int(10) NOT NULL,
  `field_id` int(10) NOT NULL,
  `address_id` int(10) NOT NULL,
  `date_created` int(11) NOT NULL DEFAULT '0',
  `last_modified` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Thanks for the help!

QUESTION: This query limits the results to only messages that have an address with a field id of 4. How would I make that optional, so a message is returned whether or not it has a correlated address with a field id of 4?
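One hedged way to make it optional (assuming the intent is to still return messages that have no field-4 address) is to move the field_id filter from the WHERE clause into the LEFT JOIN condition, so unmatched messages survive the join:

SELECT m.id, m.seen, m.uid, m.hash, m.date_created, m.subject, m.textplain, m.texthtml,
       CONCAT_WS('@', a.localpart, a.domain) AS address
FROM messages m
LEFT JOIN message_address_fields maf
       ON maf.message_id = m.id
      AND maf.field_id = 4
LEFT JOIN addresses a ON a.id = maf.address_id
WHERE m.user_id = 1;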

SSIS organization

Posted: 15 Jul 2013 04:16 PM PDT

I'm familiar with SSMS, but I've never used .NET, C#, or Visual Studio (I've been doing other things: assembler, C, Unix, vi, Oracle, etc.).

I'm trying to figure out basic 2012 SSIS ETL (no BI or data warehouse) and can't find anything that explains how all of these fit together:

  • catalog
  • solution
  • project
  • package
  • any others?

Say I just want to build a simple test using SSIS, importing a file into a table: which of the above are required, and in what order do I create them?

Also, do you have any good links to basic SSIS 2012 ETL overviews and/or how-to articles?

How to avoid empty rows in SSIS Excel Destination?

Posted: 15 Jul 2013 06:09 PM PDT

Does anyone have a way to avoid empty rows when using SSIS to export to Excel? Here's a simple example of one data flow task:

OLE DB Source:

[screenshot: OLE DB Source]

Data Conversion (to handle the annoying UNICODE / NON-UNICODE deal):

[screenshot: Data Conversion]

The end result is either of the two below, depending on the value of "FirstRowHasColumnName" in the Excel Connection Manager. Note the blank rows.

[screenshot: output 1]

[screenshot: output 2]

MySQL event gets dropped/expired on closing the command prompt

Posted: 15 Jul 2013 07:20 PM PDT

As I am quite new to the server side, I tried to create an event using:

CREATE EVENT enetname
  ON SCHEDULE AT '2013-06-13 18:00:00' + INTERVAL 30 SECOND
DO
  UPDATE tablname SET status = 'Expired' WHERE expirydate < CURDATE();

This event keeps updating the field as long as the MySQL command prompt is open.

When I close the command prompt, the event gets dropped/expired, whereas my event scheduler is still running and waiting for the next activation.

Please tell me how to avoid the event being dropped.
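For comparison, a hedged sketch of an event declared with a recurring schedule and ON COMPLETION PRESERVE, so the server does not auto-drop it after it fires (event and table names copied from the question; the interval is illustrative):

CREATE EVENT enetname
  ON SCHEDULE EVERY 30 SECOND
  ON COMPLETION PRESERVE
DO
  UPDATE tablname SET status = 'Expired' WHERE expirydate < CURDATE();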

How to change a user password while checking the current password in Oracle?

Posted: 15 Jul 2013 07:25 PM PDT

In my application I have a change-password form where users enter their current password and the new password. Then I change their password using the following command:

alter user user_name identified by new_password;  

My problem is that I don't know how to check whether the current password which the user entered in the form is correct or not (as the passwords are stored encrypted in Oracle).

How should I do this?

Strange IO on MySQL server

Posted: 15 Jul 2013 07:22 PM PDT

One of my MySQL servers has become slow recently, and I found that there is high IO on the server:

$ sudo pt-ioprofile -cell sizes
Mon Apr 29 22:39:32 CDT 2013
Tracing process ID 16934
     total      pread       read     pwrite      fsync       open      lseek filename
2147483647    1735082      64640     369248          0          0 2147483647 /var/lib/mysql/db1/estate.MYD
 889808476     524176      19420     128436          0          0  889136444 /var/lib/mysql/db2/estate.MYD
 343733731       7231        600          0          0          0  343725900 /var/lib/mysql/db2/cmsuser.MYD
  18334349       3325        180          0          0          0   18330844 /var/lib/mysql/db1/cmsuser.MYD
    104076          0          0     104076          0          0          0 /var/lib/mysql/db1/estate.MYI
     98304          0          0      98304          0          0          0 /var/lib/mysql/ibdata1
     34300          0          0      34300          0          0          0 /var/lib/mysql/db2/estate.MYI
      1024          0          0       1024          0          0          0 /var/lib/mysql/ib_logfile1
       512          0          0        512          0          0          0 /var/lib/mysql/ib_logfile0
         0          0          0          0          0          0          0 /etc/hosts.deny
         0          0          0          0          0          0          0 /etc/hosts.allow

However, when I run SHOW PROCESSLIST on the server, I find no active process!

What's possibly going on here?

MSDTC on SQL Server 2012 with mount point for data disk

Posted: 15 Jul 2013 07:22 PM PDT

I have installed a new SQL Server 2012 instance on a cluster. The installation went fine, but when I added MSDTC as a resource of the instance, things went wrong. When I first added it, MSDTC was online, but when I tried my first failover, it refused to come online. The error message was "MSDTC failed".

My configuration is: a physical disk (let's say J:) with two mount points, "DATA" and "LOG".

I have read about people who experienced this kind of error with mount points and SQL Server 2008, but nothing about 2012.

I tried to create the "MSDTC" folder on the physical drive (the J: one), but that did not help.

Does anyone know something about it or know how to configure the MSDTC and force it to use the J: drive?

How to disable Oracle's MAX_ENABLED_ROLES limit

Posted: 15 Jul 2013 02:09 PM PDT

How can I disable Oracle's MAX_ENABLED_ROLES limit, or raise the limit? [Oracle 10g (win32)]
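For reference, a hedged sketch of raising the value via the spfile (the value is illustrative; the parameter is static, so an instance restart is needed, and in 10g it is reportedly retained mainly for backward compatibility):

ALTER SYSTEM SET max_enabled_roles = 100 SCOPE = SPFILE;
-- restart the instance for the new value to take effect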

Oracle error handling strategy [duplicate]

Posted: 15 Jul 2013 07:24 PM PDT

I am developing a project using Oracle. I am writing functions and stored procedures to handle CRUD statements.

My question: in addition to Oracle check constraints, do I have to check error situations myself, or should I let Oracle do the job?

For example, if I have a unique column in a table and I want to insert into this table, should I check that the value is unique myself, or let the Oracle unique constraint do the job and raise an error if it is a repeated value?
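A minimal sketch of letting the constraint do the work and handling the violation in PL/SQL (table, column, and error text are illustrative):

BEGIN
  INSERT INTO customers (email) VALUES ('foo@example.com');
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    -- the unique constraint rejected the row; translate ORA-00001 into a friendlier error
    RAISE_APPLICATION_ERROR(-20001, 'This email address is already registered.');
END;
/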

Is it possible to use Oracle XMLTABLE and %type column type selector?

Posted: 15 Jul 2013 07:23 PM PDT

I am creating a stored procedure which gets an XML input and inserts the data into a table using XMLTABLE.

I want to know how I can specify a field type according to a table column's type when listing columns in XMLTABLE (using something like %TYPE).

Here is my simplified code:

SELECT tmp.*
FROM (SELECT XMLTYPE(xml_in) AS xml FROM dual),
     XMLTable('for $i in /rowset/row return $i'
              PASSING xml
              COLUMNS
                name   MYTABLE.name%TYPE   path 'name',    -- %TYPE just doesn't work here :(
                family MYTABLE.family%TYPE path 'family'
             ) tmp

What is the best way to insert XML input into an Oracle table in a stored procedure?

Posted: 15 Jul 2013 07:24 PM PDT

I want to write a stored procedure which has an XML input (CLOB type) and inserts the data into a table. My XML structure is something like this:

<rowset>
    <row>
        <name>tom</name>
        <family>mack</family>
        ....
    </row>
    <row>
        <name>marry</name>
        <family>skot</family>
        ....
    </row>
    ...
</rowset>

I am simply stuck among the many options that Oracle gives me to handle this: using EXTRACTVALUE and XMLSequence, using DBMS_XMLDOM, using DBMS_XMLSTORE, or many others.

I want to know which one is better in terms of performance and resource use. The input XML is usually less than 50 KB.
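For comparison, a hedged sketch of one of those options, an INSERT ... SELECT over XMLTABLE (the procedure, table, and column definitions are illustrative):

CREATE OR REPLACE PROCEDURE load_rows (xml_in IN CLOB) AS
BEGIN
  INSERT INTO mytable (name, family)
  SELECT x.name, x.family
  FROM XMLTABLE('/rowset/row'
         PASSING XMLTYPE(xml_in)
         COLUMNS
           name   VARCHAR2(100) PATH 'name',
           family VARCHAR2(100) PATH 'family') x;
END;
/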

Do roles have an owner in Oracle?

Posted: 15 Jul 2013 07:26 PM PDT

Suppose that I have logged in to the Oracle database as user X, and user X then creates some roles using the CREATE ROLE command. Is user X the owner of those roles? Can I select all roles created by user X?

Effective MySQL table/index design for a 35-million-row (and growing) table with 200+ corresponding columns (double), any combination of which may be queried

Posted: 15 Jul 2013 03:09 PM PDT

I am looking for advice on table/index design for the following situation:

I have a large table (stock price history data, InnoDB, 35 million rows and growing) with a compound primary key (assetid (int), date (date)). In addition to the pricing information, I have 200 double values that need to correspond to each record.

CREATE TABLE `mytable` (
  `assetid` int(11) NOT NULL,
  `date` date NOT NULL,
  `close` double NOT NULL,
  `f1` double DEFAULT NULL,
  `f2` double DEFAULT NULL,
  `f3` double DEFAULT NULL,
  `f4` double DEFAULT NULL,
  ... skip a few ...
  `f200` double DEFAULT NULL,
  PRIMARY KEY (`assetid`, `date`)
) ENGINE=`InnoDB` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci
  ROW_FORMAT=COMPACT CHECKSUM=0 DELAY_KEY_WRITE=0
  PARTITION BY RANGE COLUMNS(`date`) PARTITIONS 51;

I initially stored the 200 double columns directly in this table for ease of update and retrieval, and this had been working fine, as the only querying done on this table was by assetid and date (these are religiously included in any query against this table), and the 200 double columns were only read. My database size was around 45 GB.

However, now I have a requirement to be able to query this table by any combination of these 200 columns (named f1, f2, ... f200), for example:

select * from mytable
where assetid in (1,2,3,4,5,6,7,....)
  and date > '2010-1-1' and date < '2013-4-5'
  and f1 > -0.23 and f1 < 0.9
  and f117 > 0.012 and f117 < .877
  etc, etc

I have not historically had to deal with this large an amount of data before, so my first instinct was that indexes were needed on each of these 200 columns, or I would wind up with large table scans, etc. To me this meant that I needed a table for each of the 200 columns, with primary key, value, and an index on the values. So I went with that.

CREATE TABLE `f1` (
  `assetid` int(11) NOT NULL DEFAULT '0',
  `date` date NOT NULL DEFAULT '0000-00-00',
  `value` double NOT NULL DEFAULT '0',
  PRIMARY KEY (`assetid`, `date`),
  INDEX `val` (`value`)
) ENGINE=`InnoDB` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci
  ROW_FORMAT=COMPACT CHECKSUM=0 DELAY_KEY_WRITE=0;

I filled up and indexed all 200 tables. I left the main table intact with all 200 columns, as it is regularly queried over an assetid and date range with all 200 columns selected. I figured that leaving those columns in the parent table (unindexed) for read purposes, and additionally having them indexed in their own tables (for join filtering), would be most performant. I ran EXPLAIN on the new form of the query:

select count(p.assetid) as total
from mytable p
inner join f1 f1 on f1.assetid = p.assetid and f1.date = p.date
inner join f2 f2 on f2.assetid = p.assetid and f2.date = p.date
where p.assetid in (1,2,3,4,5,6,7)
  and p.date >= '2011-01-01' and p.date < '2013-03-14'
  and (f1.value >= 0.96 and f1.value <= 0.97 and f2.value >= 0.96 and f2.value <= 0.97)

Indeed, my desired result was achieved: EXPLAIN shows that far fewer rows are scanned for this query. However, I wound up with some undesirable side effects.

1) My database went from 45 GB to 110 GB. I can no longer keep the DB in RAM (I have 256 GB of RAM on the way, however).

2) Nightly inserts of new data now need to be done 200 times instead of once.

3) Maintenance/defrag of the 200 new tables takes 200 times longer than for just the one table. It cannot be completed in a night.

4) Queries against the f1, etc. tables are not necessarily performant. For example:

select min(value) from f1
where assetid in (1,2,3,4,5,6,7)
  and date >= '2013-3-18' and date < '2013-3-19'

The above query, while EXPLAIN shows it is looking at fewer than 1000 rows, can take 30+ seconds to complete. I assume this is because the indexes are too large to fit in memory.

Since that was a lot of bad news, I looked further and found partitioning. I implemented partitions on the main table, partitioned on date every 3 months. Monthly seemed to make sense to me, but I have read that once you get over 120 partitions or so, performance suffers. Partitioning quarterly will leave me under that for the next 20 years or so. Each partition is a bit under 2 GB. I ran EXPLAIN PARTITIONS and everything seems to be pruning properly, so regardless I feel the partitioning was a good step, at the very least for analyze/optimize/repair purposes.

I spent a good deal of time with this article

http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/testing-partitions-large-db.html

My table is currently partitioned with the primary key still on it. The article mentions that primary keys can make a partitioned table slower, but if you have a machine that can handle it, primary keys on the partitioned table will be faster. Knowing I have a big machine on the way (256 GB RAM), I left the keys on.

So, as I see it, here are my options:

Option 1

1) Remove the extra 200 tables and let the query do table scans to find the f1, f2, etc. values. Non-unique indexes can actually hurt performance on a properly partitioned table. Run an EXPLAIN before the user runs the query and deny it if the number of rows scanned is over some threshold I define. Save myself the pain of the giant database. Heck, it will all be in memory soon anyway.

sub-question:

Does it sound like I have chosen an appropriate partition scheme?

Option 2

Partition all 200 tables using the same 3-month scheme. Enjoy the smaller row scans and allow the users to run larger queries. Now that they are partitioned, at least I can manage them one partition at a time for maintenance purposes. Heck, it will all be in memory soon anyway. Develop an efficient way to update them nightly.

sub-question:

Do you see a reason that I should avoid primary key indexes on these f1, f2, f3, f4... tables, knowing that I always have assetid and date when querying? It seems counter-intuitive to me, but I am not used to data sets of this size. That would shrink the database a bunch, I assume.

Option 3

Drop the f1, f2, f3 columns in the master table to reclaim that space. Do 200 joins if I need to read 200 features; maybe it won't be as slow as it sounds.

Option 4

You all have a better way to structure this than I have thought of so far.

* NOTE: I will soon be adding another 50-100 of these double values to each item, so I need to design knowing that is coming.

Thanks for any and all help.

Update #1 - 3/24/2013

I went with the idea suggested in the comments below and created one new table with the following setup:

create table features (
  assetid int,
  date    date,
  feature varchar(4),
  value   double
);

I partitioned the table in 3 month intervals.

I blew away the earlier 200 tables so that my database was back down to 45 Gig and started filling up this new table. A day and a half later, it completed, and my database now sits at a chubby 220 Gigs!

It does allow the possibility of removing these 200 values from the master table, as I can get them from one join, but that would really only give me back 25 GB or so, maybe.

I asked it to create a primary key on assetid, date, feature and an index on value, and after 9 hours of chugging it really hadn't made a dent and seemed to freeze up, so I killed that part off.
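For concreteness, a hedged sketch of what that attempt amounts to as DDL (the index name is illustrative; on a table this size the rebuild is expected to be very slow):

ALTER TABLE features
  ADD PRIMARY KEY (assetid, `date`, feature),
  ADD INDEX idx_value (value);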

I rebuilt a couple of the partitions, but it did not seem to reclaim much (if any) space.

So that solution looks like it probably isn't going to be ideal. Do rows take up significantly more space than columns, I wonder? Could that be why this solution took up so much more space?

I came across this article

http://www.chrismoos.com/2010/01/31/mysql-partitioning-tables-with-millions-of-rows

it gave me an idea.

where he says

"At first, I thought about RANGE partitioning by date, and while I am using the date in my queries, it is very common for a query to have a very large date range, and that means it could easily span all partitions."

Now I am range partitioning by date as well, but I will also be allowing searches by large date range, which will decrease the effectiveness of my partitioning. I will always have a date range when I search; however, I will also always have a list of assetids. Perhaps my solution should be to partition by assetid and date, where I identify typically searched assetid ranges (which I can come up with; there are standard lists: S&P 500, Russell 2000, etc.). This way I would almost never look at the entire data set.

Then again, I am primary keyed on assetid and date anyway, so maybe that wouldn't help much.

Any more thoughts/comments would be appreciated.

Thanks.

Postgres backup/restore: restored database much smaller?

Posted: 15 Jul 2013 07:21 PM PDT

I am worried that my restored database is very different from the original:

#check size of postgres database
postgres@db1:/tmp$ psql -c "select pg_size_pretty(pg_database_size('test_db'));"
 pg_size_pretty
----------------
 2105 MB
(1 row)

#backup database
postgres@db1:/tmp$ pg_dump -Fc test_db > test_db_Fc.dump

#rename postgres database (i.e. park it nearby)
postgres@db1:/tmp$ psql -c "alter database test_db rename to test_db_20130322;"
ALTER DATABASE

#restore test_db
postgres@db1:/tmp$ pg_restore -Fc -C -d postgres test_db_Fc.dump

#check size of restored postgres database
postgres@db1:/tmp$ psql -c "select pg_size_pretty(pg_database_size('test_db'));"
 pg_size_pretty
----------------
 257 MB
(1 row)

The original DB is many times larger than the restored one. What is going on here? As far as I can tell, the website that test_db serves is still working perfectly well after a restore, but I need to know what's up before I use a backup in a live context.

FYI if I run vacuumdb on each database there seems to be no change in database size.

[Addendum, added later] In the tradition of RTFM, I have gone hunting in the PostgreSQL manual. Here are some more pieces of the puzzle:

#is the autovacuum switched on?
postgres@db1:/tmp$ psql -c "SHOW autovacuum;"
 autovacuum
------------
 on
(1 row)

#The "track counts" parameter is needed by autovacuum, which uses statistics from database activity to know where to do its job
postgres@db1:/tmp$ psql -c "SHOW track_counts;"
 track_counts
--------------
 on
(1 row)

#is there an autovacuum daemon resident in memory?
postgres@db1:/tmp$ ps -ef | grep 'autovacuum'
postgres  1261  1021  0 Jan23 ?        00:08:27 postgres: autovacuum launcher process
postgres 18347 18149  0 00:33 pts/0    00:00:00 grep autovacuum

#what's been happening on the live server?
postgres@LIVEdb1:/tmp$ psql -c "SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_all_tables;"
#result is a list of 65 tables (out of about 300), all empty (no dates at all)

Gratefully following up on @craig-ringer's advice to look into VACUUM FULL I turned to PostgreSQL documentation, (paraphrased) "... In the worst case where your disk is nearly full, VACUUM FULL may be the only workable alternative. ... (but) the usual goal of routine vacuuming is to avoid needing VACUUM FULL. The autovacuum daemon attempts to do standard VACUUMs often enough to maintain steady-state usage of disk space ..."

I will follow @chris-travers' advice and map out the number of rows in the tables from each version of the database. I think in my case it'd be fair to say that VACUUM FULL will relieve the pressure on disk space, and it will make original-vs-restored look better, but there's still this red flag of excessive bloat. I think autovacuum isn't doing anything, and that's a worry! Thanks for the guidance so far; it's fascinating.
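For that row-count comparison, a minimal sketch of a query to run against both test_db and test_db_20130322 (this uses the statistics collector's live-tuple estimates rather than exact counts):

SELECT relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;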

MySQL Replication using SSL

Posted: 15 Jul 2013 08:09 PM PDT

I am in the process of replicating my database so I can have a master-slave configuration. One of the issues I have is with security: I am generating my server/client keys and certificates using OpenSSL, and I also generate my own CA key and certificate to self-sign them. I understand the issues with self-signed certificates on a public website, but do you think this will be as serious a problem when used for replication?

Finding out the hosts blocked by the MySQL server

Posted: 15 Jul 2013 07:09 PM PDT

Can someone tell me how to list the hosts which are blocked by the MySQL server because they crossed the max_connect_errors limit? Is there any table in which the MySQL server keeps this data? I am using mysql-server-5.1.63.

Delete a word, its meanings, and its meanings' example sentences from the DB

Posted: 15 Jul 2013 04:09 PM PDT

I have three tables as below (simplified for demonstration):

words
=====
integer id
text    word

meanings
========
integer id
integer word_id
text    meaning

examples
========
integer id
integer meaning_id
text    sentence

where word_id stores the id of the word in the words table and meaning_id stores the id of the meaning in the meanings table. I am trying to figure out a SQL query that, given a word's id, deletes the word together with all its meanings and example sentences in one go. Is such a SQL query possible to compose? If so, how?

Edit1: I am using SQLite3 as the database.

Edit2: I figured out the following solution, which requires 3 SQL queries run in order:

DELETE FROM examples WHERE meaning_id IN (SELECT id FROM meanings WHERE word_id = the_given_id);
DELETE FROM meanings WHERE word_id = the_given_id;
DELETE FROM words WHERE id = the_given_id;

I'm still looking for the answer to my question: can the whole process be done in one query?
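One hedged sketch of a single-statement approach: declare the relationships with ON DELETE CASCADE (which means recreating the child tables) and enable foreign-key enforcement in SQLite, after which deleting the word cascades on its own:

PRAGMA foreign_keys = ON;

CREATE TABLE meanings (
  id      INTEGER PRIMARY KEY,
  word_id INTEGER REFERENCES words(id) ON DELETE CASCADE,
  meaning TEXT
);

CREATE TABLE examples (
  id         INTEGER PRIMARY KEY,
  meaning_id INTEGER REFERENCES meanings(id) ON DELETE CASCADE,
  sentence   TEXT
);

-- a single statement now removes the word, its meanings, and their examples
DELETE FROM words WHERE id = :the_given_id;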

INNER JOIN giving timeout on large database

Posted: 15 Jul 2013 07:26 PM PDT

I am getting a timeout on this script:

UPDATE uk_data AS ud
INNER JOIN uk_pc AS up
        ON ud.cat10 = up.WardCode
SET ud.cat8 = up.Latitude,
    ud.cat9 = up.Longitude;

The uk_pc table has 1,755,213 entries and uk_data has 24,510 entries.

Is there any other way to do this where I don't get a timeout?

I want to update uk_data cat8 and cat9 with uk_pc Latitude and Longitude.

Table1: uk_pc
    Latitude
    Longitude
    WardCode

Table2: uk_data
    cat8
    cat9
    cat10

In both tables, WardCode and cat10 hold the same values.

Table Definitions

Table1 uk_data

CREATE TABLE IF NOT EXISTS `uk_data` (
  `slno` int(100) NOT NULL AUTO_INCREMENT,
  `comp_name` varchar(150) DEFAULT NULL,
  `comp_no` varchar(50) DEFAULT NULL,
  `comp_street` varchar(100) DEFAULT NULL,
  `comp_area` varchar(100) DEFAULT NULL,
  `comp_post_code` varchar(50) DEFAULT NULL,
  `comp_phone` varchar(100) DEFAULT NULL,
  `comp_phone1` varchar(100) DEFAULT NULL,
  `cat1` varchar(100) DEFAULT NULL,
  `cat2` varchar(100) DEFAULT NULL,
  `cat3` varchar(100) DEFAULT NULL,
  `cat4` varchar(100) DEFAULT NULL,
  `cat5` varchar(100) DEFAULT NULL,
  `cat6` varchar(100) DEFAULT NULL,
  `cat7` varchar(100) DEFAULT NULL,
  `cat8` varchar(100) DEFAULT NULL,
  `cat9` varchar(100) DEFAULT NULL,
  `cat10` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`slno`),
  UNIQUE KEY `Phone` (`comp_phone`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=31717;

Table2(uk_pc)

DROP TABLE IF EXISTS `uk_pc`;
CREATE TABLE `uk_pc` (
  `slno` int(11) NOT NULL AUTO_INCREMENT,
  `Postcode` varchar(25) DEFAULT NULL,
  `Latitude` decimal(9,6) DEFAULT NULL,
  `Longitude` decimal(9,6) DEFAULT NULL,
  `Easting` varchar(25) DEFAULT NULL,
  `Northing` varchar(25) DEFAULT NULL,
  `GridRef` varchar(25) DEFAULT NULL,
  `County` varchar(25) DEFAULT NULL,
  `District` varchar(25) DEFAULT NULL,
  `Ward` varchar(25) DEFAULT NULL,
  `DistrictCode` varchar(25) DEFAULT NULL,
  `WardCode` varchar(15) DEFAULT NULL,
  `Country` varchar(25) DEFAULT NULL,
  `CountyCode` varchar(25) DEFAULT NULL,
  PRIMARY KEY (`slno`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
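As a hedged aside: in the definitions above, neither side of the join predicate (WardCode, cat10) is indexed, so one sketch worth trying before anything else is simply indexing the join columns (index names are illustrative):

ALTER TABLE uk_pc   ADD INDEX idx_wardcode (WardCode);
ALTER TABLE uk_data ADD INDEX idx_cat10    (cat10);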

Do I need an index on this?

Posted: 15 Jul 2013 07:23 PM PDT

I have a SQL Server DB and have a question regarding foreign key indexes.

Here's a snippet from the db:

[image: schema snippet]

In programme_affiliates, I have a computed field; it is always set to '2' in this instance.

The reason for it is that it maintains integrity between programmes and programmes_affiliates.

I also have a link to programme_types to ensure that '2' is always valid as a programme type.

I believe that you should always provide an index for a foreign key, but as it is a calculated fixed field, is this necessary in this case?

How to modify an update in Oracle so it performs faster?

Posted: 15 Jul 2013 01:09 PM PDT

I have this query:

UPDATE
(
    SELECT h.valid_through_dt, h.LAST_UPDATE_TMSTMP
    FROM ETL_FEE_SCH_TMP d, FEE_SCHEDULE_HISTORICAL h
    WHERE h.FUND_ID = d.FUND_ID
    AND h.FEETYPE_NAME = d.FEETYPE_NAME
    AND h.BREAKPOINT_TYPE = d.BREAKPOINT_TYPE
    AND h.BREAKPOINT_QTY = d.BREAKPOINT_QTY
    AND h.LOW_BREAKPOINT_AMT = d.LOW_BREAKPOINT_AMT
    AND h.VALID_THROUGH = TO_DATE ('31-DEC-9999', 'dd-mon-yyyy')
    AND h.universe = 'DC'
    AND h.universe = d.universe
    AND EXISTS
    (
        SELECT 1
        FROM FEE_SCHEDULE s
        WHERE s.FUND_ID = h.FUND_ID
        AND s.FEETYPE_NAME = h.FEETYPE_NAME
        AND s.BREAKPOINT_TYPE = h.BREAKPOINT_TYPE
        AND s.BREAKPOINT_QTY = h.BREAKPOINT_QTY
        AND s.LOW_BREAKPOINT_AMT = h.LOW_BREAKPOINT_AMT
        AND s.universe = 'DC'
    )
) updateTable
SET updateTable.VALID_THROUGH = (SYSDATE - 1),
    updateTable.LAST_UPDATE_TMSTMP = SYSTIMESTAMP;

The trouble I am having is that this query takes a long time to run. I don't know whether it is possible to run it in parallel, or whether it would be easier to update via a cursor in a pipelined function.

What would you suggest?

This is all the information that I believe is relevant.

This is the execution plan of the internal select:

Execution Plan
----------------------------------------------------------
Plan hash value: 57376096

---------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                          |     1 |   306 |  8427   (1)| 00:01:42 |
|   1 |  NESTED LOOPS                |                          |       |       |            |          |
|   2 |   NESTED LOOPS               |                          |     1 |   306 |  8427   (1)| 00:01:42 |
|   3 |    MERGE JOIN CARTESIAN      |                          |     1 |   192 |  8426   (1)| 00:01:42 |
|*  4 |     INDEX RANGE SCAN         | SYS_C000666              |     1 |    96 |     2   (0)| 00:00:01 |
|   5 |     BUFFER SORT              |                          |  3045K|   278M|  8425   (1)| 00:01:42 |
|   6 |      SORT UNIQUE             |                          |  3045K|   278M|  8425   (1)| 00:01:42 |
|*  7 |       TABLE ACCESS FULL      | FEE_SCHEDULE             |  3045K|   278M|  8425   (1)| 00:01:42 |
|*  8 |    INDEX RANGE SCAN          | FEE_SCHDL_IDX1           |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID| FEE_SCHEDULE_HISTORICAL  |     1 |   114 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("D"."UNIVERSE"='DC')
   7 - filter("S"."UNIVERSE"='DC')
   8 - access("H"."UNIVERSE"='DC' AND "S"."FUND_ID"="H"."FUND_ID" AND
              "S"."FEETYPE_NAME"="H"."FEETYPE_NAME" AND
              "S"."BREAKPOINT_TYPE"="H"."BREAKPOINT_TYPE" AND
              "S"."BREAKPOINT_QTY"="H"."BREAKPOINT_QTY" AND
              "S"."LOW_BREAKPOINT_AMT"="H"."LOW_BREAKPOINT_AMT")
       filter("H"."FUND_ID"="D"."FUND_ID" AND
              "H"."FEETYPE_NAME"="D"."FEETYPE_NAME" AND
              "H"."BREAKPOINT_TYPE"="D"."BREAKPOINT_UNIT_TY

Table data:

UNIVERSE|FUND_ID   |FEETYPE_NAME |BREAKPOINT_TYPE|BREAKPOINT_QTY|LOW_BREAKPOINT_AMT|HIGH_BREAKPOINT_AMT|FEE_PCT|FEE_SCHDL_SEQ_ID|GROUP_ID|LAST_UPDATE_TMSTMP  |VALID_FROM|VALID_THROUGH|INSERT_TMSTMP        |JOB_ID|
--------|----------|-------------|---------------|--------------|------------------|-------------------|-------|----------------|--------|--------------------|----------|-------------|---------------------|------|
DC      |DC9ZTPLPHO|DeferLoad    |Percentage     |4             |10000             |300000             |3.14   |780250          |null    |1/4/2012  3:59:54 PM|6/23/2012 |12/31/9999   |1/5/2011   3:59:54 PM|666   |
DC      |DCE86Y8XFU|RedemptionFee|Percentage     |9             |  100             |100500             |7.67   |780251          |null    |6/4/2012  4:49:54 PM|11/12/2011|12/31/9999   |8/17/2011  2:00:54 PM|666   |
DC      |DCAYL0KONA|FrontLoad    |Percentage     |2             |50000             |601500             |5.00   |780252          |null    |4/25/2012 4:49:54 PM|8/2/2012  |12/31/9999   |12/19/2012 9:59:00 PM|666   |
DC      |DC9ZTPLPHO|DeferLoad    |Percentage     |7             |80000             |900000             |2.24   |780252          |null    |4/25/2012 4:49:54 PM|8/2/2012  |12/31/9999   |12/19/2012 9:59:00 PM|666   |

This is the script of the historical table:

CREATE TABLE FEE_SCHEDULE_HISTORICAL
(
  UNIVERSE            VARCHAR2(2 BYTE)  NOT NULL,
  FUND_ID             VARCHAR2(10 BYTE) NOT NULL,
  FEETYPE_NAME        VARCHAR2(75 BYTE),
  BREAKPOINT_TYPE     VARCHAR2(50 BYTE),
  BREAKPOINT_QTY      VARCHAR2(10 BYTE),
  LOW_BREAKPOINT_AMT  NUMBER(19,6),
  HIGH_BREAKPOINT_AMT NUMBER(19,6),
  FEE_PCT             NUMBER(19,6),
  FEE_SCHDL_SEQ_ID    NUMBER NOT NULL,
  GROUP_ID            NUMBER,
  LAST_UPDATE_TMSTMP  DATE   NOT NULL,
  VALID_FROM          DATE   NOT NULL,
  VALID_THROUGH       DATE   NOT NULL,
  INSERT_TMSTMP       DATE   NOT NULL,
  JOB_ID              NUMBER NOT NULL
);

CREATE UNIQUE INDEX FEE_SCHDL_PK ON FEE_SCHEDULE_HISTORICAL(FEE_SCHDL_SEQ_ID);

CREATE UNIQUE INDEX FEE_SCHDL_HST_IDX ON FEE_SCHEDULE_HISTORICAL (
  UNIVERSE,
  FUND_ID,
  FEETYPE_NAME,
  BREAKPOINT_TYPE,
  BREAKPOINT_QTY,
  LOW_BREAKPOINT_AMT,
  VALID_FROM,
  JOB_ID
);

CREATE INDEX FEE_SCHEDULE_HST_IDX2 ON FEE_SCHEDULE_HISTORICAL(LAST_UPDATE_TMSTMP);

CREATE INDEX FEE_SCHEDULE_HST_IDX3 ON FEE_SCHEDULE_HISTORICAL(VALID_THROUGH);

ALTER TABLE FEE_SCHEDULE_HISTORICAL ADD (
  CONSTRAINT FEE_SCHDL_PK
  PRIMARY KEY (FEE_SCHDL_SEQ_ID)
);

This is the other table:

CREATE TABLE FEE_SCHEDULE
(
  UNIVERSE            VARCHAR2(2 BYTE)  NOT NULL,
  FUND_ID             VARCHAR2(10 BYTE) NOT NULL,
  FEETYPE_NAME        VARCHAR2(75 BYTE),
  BREAKPOINT_TYPE     VARCHAR2(50 BYTE),
  BREAKPOINT_QTY      VARCHAR2(10 BYTE),
  LOW_BREAKPOINT_AMT  NUMBER(19,6),
  HIGH_BREAKPOINT_AMT NUMBER(19,6),
  FEE_PCT             NUMBER(19,6),
  JOB_RUN_ID          NUMBER NOT NULL,
  FILE_DATE           DATE   NOT NULL,
  CYCLE_DATE          DATE   NOT NULL
)

The temporary table is the result of FEE_SCHEDULE_HISTORICAL minus FEE_SCHEDULE

Query to find and replace text in all tables and fields of a MySQL DB

Posted: 15 Jul 2013 05:09 PM PDT

I need to run a query to find and replace some text in all tables of a MySQL database.

I found this query, but it only looks for the text in the tbl_name table and just in the column field.

update tbl_name set column=REPLACE(column, 'fuschia', 'fuchsia');   

I need it to look in all tables and all fields (everywhere in the database).
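One hedged sketch is to generate the per-column UPDATE statements from information_schema and review them before running (this assumes the database is named your_db and that every text-like column should be touched):

SELECT CONCAT('UPDATE `', table_schema, '`.`', table_name,
              '` SET `', column_name, '` = REPLACE(`', column_name,
              '`, ''fuschia'', ''fuchsia'');') AS stmt
FROM information_schema.columns
WHERE table_schema = 'your_db'
  AND data_type IN ('char', 'varchar', 'text', 'mediumtext', 'longtext');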

Export as CSV or XLS file from SQL Server 2008 Express R2

Posted: 15 Jul 2013 07:25 PM PDT

I am basically a LAMP developer who got an assignment where SQL Server is required. I have installed SQL Server 2008 R2 Express and imported a backup file into it.

Now I want to export the user table to Excel or CSV, but I am not able to do that.

Can anyone tell me how I can export the table?
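One hedged option is bcp from the command line (the instance name .\SQLEXPRESS, database, and table names are assumptions; -c writes character data, -t, sets a comma delimiter, -T uses Windows authentication):

bcp "SELECT * FROM MyDatabase.dbo.users" queryout users.csv -c -t, -S .\SQLEXPRESS -T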

MongoDB: queue vs. realtime write/update

Posted: 15 Jul 2013 07:23 PM PDT

On a large MongoDB instance I have 2 different use cases for reading and 2 different types of write/update.

  • read 1: find by index -> no question on this. This works like a charm, as long as the index fits in memory/RAM.
  • read 2: read line by line, like a batch script reading a log file one entry at a time.
  • write 1: insert - append a new dataset or document.
  • write 2: update data, directly addressed or found by index.

Now I wonder if MongoDB is able to "prioritize" different write tasks. I have one collection which does not require the data to be written and available instantly in "real time". It's enough to use a slow queue and process these write tasks without hurry. However, there are also write statements that should be processed ASAP and be available with a minimum delay (if any).

Does MongoDB allow me to differentiate between high-priority (ASAP) and low-priority (queue) writes/updates?

The order of INNER JOINs

Posted: 15 Jul 2013 07:25 PM PDT

I would like to know if there are any rules about the order of INNER JOINs in T-SQL (SQL Server 2000).

I got 62x the performance on my query when the INNER JOIN that acts like a filter was placed first instead of last.

In reference to http://sql-4-life.blogspot.com/2009/03/order-of-inner-joins.html
