Field data type / validations


I have few questions on what data types to use and how to define some fields from my site. My current schema is in MySQL but in process of changing to PostregSQL.

  1. First & Last name -> Since I have multi-lang, tables all support UTF-8, but do i need to declare them as nvarchar in-case a user enters a Chinese name? If so, how do i enforce field validation if it is set to accept alphabets only as i assume those are English alphabets and not validating for valid chinese or arabic alphabets? And i don't think PostregSQL supports nvarchar anyways?

  2. To store current time line - > Example I work in company A from Jan 2009 to Present. So i assume there will be 3 field for this: timeline_to, timeline_from, time_line present where to & from are month/year varchars and present is just a flag to set the current date?

  3. User passwords. i am using SHA 256 + salting. so i have 2 fields declared as follows:
    password_hash - varchar (64)
    password_salt- varchar (64)
    Does this work if the user password needs to be between 8 and 32 chars long?

  4. birth time -> I need to record birth time for the application to calculate some astrological values. so that means hour, minute and am/pm. So best to store these are 3 separate single select lists with varchar or use a time data type in the back end and allow users to use single select list in front end?

  5. Lastly for birth month and year only, are these int or varchar if i store them in separate rows? They all have primary keys of int for reporting purposes so int makes more sense? or should i store them in 1 field only as date type?


    • Never make anything variable that you can make fixed; it is an added burden to pack/unpack on every access. Which means never, ever use var for indexed columns, you will have a very sluggish index. Disk space is cheap these days.

    • you need a Language column at the Person level that tells you what language to use in your various parsing and validation requirements.

  2. Let's say you have Person, Employer, and Employment tables. The columns you discuss are in Employment.

    • you need a StartDate column and EndDate column, they are DATETIME datatype.

    • You do not need "present" as a separate column. "Present" is always the value of the newest Employment row, unless set to something different. Set a Default of the highest date the db can handle, eg. 9999-12-31; which can be overridden by an explicit entry.

  3. No. You only need one CHAR(256) column. Hank has explained it.

  4. For any component of a date or time, use the DATETIME datatype. That is what it is there for. The database handles it consistently, and indexes it perfectly. You perform DATE arithmetic on it, using db various functions(). And you avoid all the problems of coding it as INTs, etc (no invalid dates or times allowed).

  5. BirthDateTime is one DATETIME column.