Discontinued Magazine Index

The index is gone, in case anyone here has used it. I used this site quite a lot. It will be missed.

http://index.mrmag.com/tm.exe?tmpl=tm_faq

Rich

Yep, that's the way we did it

Yep, that's the way we did it in Database school!  We started with the diamond-oval-square layout [I think it's the "flowchart" layout, and it's really good for mapping through a maze] and then converted it to ERD format, with the information in tables.  The ERD works wonders when you convert your chart into tables, as it helps organize the tuples.

I wish I had my instructor's skill.  He could take this site from a simple blank TextPad file to a finished product [an interactive website complete with privileges and a number of pages for alternative viewing parameters] in under an hour.  And the whole time he'd tell us how easy it was.  Meanwhile, we were sitting there with our jaws ajar, watching in horror at how far we'd have to go to get where he was.

The main strength of the ERD is that the relationships are stated along with the tables.  For example, one issue has many articles, but one article belongs to only one issue [unless it is published more than once - I've run into this situation!].  One user has one account, and so forth.

From the table format, you can then assign primary keys, alternate keys, and other important database information that isn't as important for the end user, but is vital for the top-end database engineer.
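As a rough illustration of the step from ERD to tables, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are made up for the example and are not from anyone's actual index. It shows "one issue has many articles" as a foreign key, with a primary key and an alternate key on the issue table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# Hypothetical schema: one issue has many articles, one article has one issue.
conn.executescript("""
CREATE TABLE issue (
    issue_id  INTEGER PRIMARY KEY,            -- primary key
    magazine  TEXT NOT NULL,
    pub_year  INTEGER NOT NULL,
    pub_month INTEGER NOT NULL,
    UNIQUE (magazine, pub_year, pub_month)    -- alternate key
);
CREATE TABLE article (
    article_id INTEGER PRIMARY KEY,
    issue_id   INTEGER NOT NULL REFERENCES issue(issue_id),
    title      TEXT NOT NULL
);
""")

# Made-up sample data.
conn.execute("INSERT INTO issue VALUES (1, 'Example Magazine', 1995, 6)")
conn.executemany(
    "INSERT INTO article (issue_id, title) VALUES (?, ?)",
    [(1, "Building a freight house"), (1, "Benchwork basics")],
)

articles_in_issue = conn.execute(
    "SELECT COUNT(*) FROM article WHERE issue_id = 1"
).fetchone()[0]
print(articles_in_issue)
```

The "published more than once" case does not fit this layout, since each article row points at exactly one issue.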

My favorite database, by the way, is Excel.  One table, infinite in both directions [almost], and every field has infinite length [almost].  Ten seconds through Access [ugly program] and you can suck those tables out of Excel and pump them into Access, and from there, you can pump them into an SQL environment on an Apache server, where life gets so much easier once Access no longer exists!  At least, that's my take!  You are probably very familiar with all of this already, but...sigh, I miss grad school for this class.  It was a lot of fun!

 

 

---------------------------------------------------------------------------------------------------------

Benny's Index or Somewhere Chasing Rabbits


Gender / Homographs / Location Algorithms

Perl has modules for that sort of thing, plus Soundex and the other algorithms you mention. Check out CPAN.ORG for more info:

Text-GenderFromName

Lingua::EN::Inflect

(Just to point out a couple. Granted the first one is recent, but I'm pretty sure I've used an earlier version of it, or something like it a couple of years ago on a project).
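For anyone who hasn't met Soundex, here is a small sketch of the usual American Soundex coding, written in plain Python rather than taken from the Perl modules above. The function name is my own; treat it as an illustration of the idea, not a replacement for a tested library.

```python
def soundex(name):
    """Return the 4-character American Soundex code for a name."""
    groups = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
              ("l", "4"), ("mn", "5"), ("r", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue              # h and w do not separate equal codes
        code = codes.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            result += code
        prev = code               # a vowel resets prev, so repeats count again
    return (result + "000")[:4]   # pad or trim to one letter plus three digits


print(soundex("Robert"), soundex("Rupert"))
```

Names that sound alike get the same code, so a search for one spelling can also find the other.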

 

John

Modeling the South Pennsylvania Railroad ("The Hilltop Route") in its final days of steam. Heavy patronage by the Pennsy and Norfolk & Western. Coal, sand/gravel/minerals, wood, coke, light industry, finished goods, dairy, mail and light passenger service. Interchanges with the PRR, N&W, WM and Montour.

My favorite database, by the

My favorite database, by the way, is Excel.  One table, infinite in both directions [almost], and every field has infinite length [almost].  Ten seconds through Access [ugly program] and you can suck those tables out of Excel and pump them into Access, and from there, you can pump them into an SQL environment on an Apache server, where life gets so much easier once Access no longer exists!

Seems a bit convoluted. Do you put all your data into Excel and then process it through to the SQL environment? How do you handle entry of new records and corrections on the existing ones?

I use a data entry/edit screen, an HTML page generated by PHP, where everything goes right into SQL.

Rod

Rod Goodwin
IndexGuy
Skype: IndexGuy1

Developer and moderator of The Railroad Index,
the most effective model railroad index on the Internet!

 

There are two ways to add data

There are two ways to add data to a database:

You can use the "add a record" feature to add records one at a time.  You can program this same feature to load the record you wish to edit, and then edit the records, again one at a time.

Alternatively, you can load the records using a batch file - essentially a set of data in an organized format.  Excel provides data in this manner - or at least, Access does.

The benefit of Access is that I can take a dataset that appears in tabular format in a book [like MR's "Index of Plans"], scan it [it's something like 98 pages, if I remember right], OCR the scans, and then paste the results into Excel.  I successfully migrated this index into Excel over the course of a week - the editing was atrocious, but it would have been months if I had to enter all the data by hand!

Now if I wanted to convert that index into a database, I could simply iron that Excel spreadsheet through Access into the table of a pre-existing database, append the data to the current table data, and then reload the table into the Access/MySQL database.  End result, I can load a very large amount of data [thousands and tens of thousands of records] in a very short amount of time [as much time as it would take to load one or two records]!

I would still then be able to add and edit records on the frontside of the system like normal - via a standard web-portal administrative page.

So there's frontside and backside record entry - frontside is good for a low number of records or a low number of changes, while backside is good for uploading gargantuan piles of data, or even changing field layouts and many entry values simultaneously. [As in, we decide to call "Stations", "Depots," and then a week later decide "Depots" are better called "Stations." And then someone smarter than all of us finds a way to integrate "Depot" "Station" "Passenger" "Freight" all together into a subset of fields that makes so much sense we all celebrate with glee!  But that's another challenge for another day!]
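To make the frontside/backside distinction concrete, here is a rough Python sketch using the standard csv and sqlite3 modules. The table, the column names, and the sample rows are invented for illustration. The CSV text stands in for whatever an Excel export would produce, and sqlite3 stands in for the real SQL server.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan (title TEXT, structure_type TEXT, page INTEGER)")


def add_record(title, structure_type, page):
    """Frontside entry: one record at a time, as a web form handler would do."""
    conn.execute("INSERT INTO plan VALUES (?, ?, ?)", (title, structure_type, page))


def load_batch(csv_text):
    """Backside entry: load a whole exported sheet in one call."""
    rows = [
        (r["title"], r["structure_type"], int(r["page"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO plan VALUES (?, ?, ?)", rows)
    return len(rows)


add_record("Branch line depot", "Depot", 12)

# Hypothetical export: header row plus two data rows.
exported = (
    "title,structure_type,page\n"
    "Small town depot,Depot,40\n"
    "Coal tipple,Mine,57\n"
)
loaded = load_batch(exported)

# Backside change: rename a value across every matching record at once.
renamed = conn.execute(
    "UPDATE plan SET structure_type = 'Station' WHERE structure_type = 'Depot'"
).rowcount
print(loaded, renamed)
```

The batch path costs about the same effort for two rows as for twenty thousand, which is the whole appeal.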

I'm a huge fan of ERDs, if you can't tell already - but you need a solid "client interview" in order to NAIL the entities and relationships in the ERD.  The client interview is a simple statement of what is in the database, who uses it, how they use it, and the entities detailed within the database.  A simple one is as follows:

"Articles are written by an Author or many Authors.  Some subjects covered by Articles include Model Construction, Layout Exposés, Prototype Discussions, and [...].   Articles have Titles, Subtitles, Text, Images, Illustrations, Trackplans.  Images, Illustrations, and [...].  Trackplans may be submitted by other co-contributors who are not the Author.  The layout owner is not necessarily the Layout Author.  Articles are published in publications.  They may be part of magazines, or they can be contained within special issues, or they may be other publications like books or journals.  Publications have editors, publishers, issues, months, years, and [...].  One Publishing company might have more than one location, as they move throughout the years.  And so on."

From this "Interview" I can see that we need tables for "People," "Articles," "Publications" and "Publishing Companies."  We could also have a subtable for "ArticleParts" and a subtable for "PeopleCareer."  But that's all in the pudding and more than I want to think about at this moment!  I'll go home and think about this properly on a plain sheet of paper later!!
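One possible first pass at those tables, sketched with Python's sqlite3 module. Every table and column name here is an assumption made for illustration, and the sample rows are placeholders. The link table with a role column is one way to handle "an Author or many Authors" and contributors who are not the author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person      (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE publisher   (publisher_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE publication (publication_id INTEGER PRIMARY KEY,
                          publisher_id   INTEGER REFERENCES publisher(publisher_id),
                          title          TEXT NOT NULL);
CREATE TABLE article     (article_id     INTEGER PRIMARY KEY,
                          publication_id INTEGER REFERENCES publication(publication_id),
                          title          TEXT NOT NULL);
-- Link table: many people per article, each with a role,
-- so a track plan contributor need not be the author.
CREATE TABLE article_person (article_id INTEGER REFERENCES article(article_id),
                             person_id  INTEGER REFERENCES person(person_id),
                             role       TEXT NOT NULL,
                             PRIMARY KEY (article_id, person_id, role));
""")

# Placeholder data only.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "A. Author"), (2, "B. Draftsman")])
conn.execute("INSERT INTO publisher VALUES (1, 'Example Press')")
conn.execute("INSERT INTO publication VALUES (1, 1, 'Example Magazine')")
conn.execute("INSERT INTO article VALUES (1, 1, 'A layout tour')")
conn.executemany("INSERT INTO article_person VALUES (?, ?, ?)",
                 [(1, 1, "author"), (1, 2, "track plan")])

credits = conn.execute("""
    SELECT p.name, ap.role
    FROM article_person AS ap
    JOIN person AS p ON p.person_id = ap.person_id
    WHERE ap.article_id = 1
    ORDER BY p.person_id
""").fetchall()
print(credits)
```

The "ArticleParts" and "PeopleCareer" subtables are left out here; they would hang off article and person in the same way.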

---------------------------------------------------------------------------------------------------------

Benny's Index or Somewhere Chasing Rabbits

Suggested Keyword List

Hi there,

today I've uploaded my suggestion for a keyword list to:

http://home.vrweb.de/martin_fischer/keyword.html

As discussed earlier, this goes under the assumption that a fixed list of keywords is used. Finding words in titles etc. would be done with a full-text search.

Any suggestions, additions, etc. are welcome.

Regards

Martin

Re-printing of articles

Benny,

I like the way you describe the data model in plain English. In fact, by reading the text I noticed an oversight on my part:

Articles are published in publications.  They may be part of magazines, or they can be contained within special issues, or they may be other publications like books or journals.

Kalmbach especially does a lot of reprinting: articles appear in Model Railroader and are later re-issued in one of their books or a free supplement. My data model so far doesn't handle this, but I think it should. Looks like I need to add another entity (table).
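One way the extra entity could look, as a rough sketch in Python with sqlite3. This is not the actual data model from the earlier posts. The table name "appearance", the columns, and the sample rows are all assumptions. The idea is that an article is stored once and linked to every publication it appears in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article     (article_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE publication (publication_id INTEGER PRIMARY KEY,
                          title TEXT NOT NULL,
                          kind  TEXT NOT NULL);   -- magazine issue, book, supplement
-- The additional entity: one row per place an article was printed.
CREATE TABLE appearance  (article_id     INTEGER REFERENCES article(article_id),
                          publication_id INTEGER REFERENCES publication(publication_id),
                          is_reprint     INTEGER NOT NULL DEFAULT 0,
                          PRIMARY KEY (article_id, publication_id));
""")

# Placeholder data only.
conn.execute("INSERT INTO article VALUES (1, 'Building a freight house')")
conn.executemany("INSERT INTO publication VALUES (?, ?, ?)",
                 [(1, "Example Magazine, June issue", "magazine issue"),
                  (2, "Example Structures Book", "book")])
conn.executemany("INSERT INTO appearance VALUES (?, ?, ?)",
                 [(1, 1, 0), (1, 2, 1)])

where_published = [row[0] for row in conn.execute("""
    SELECT p.title
    FROM appearance AS a
    JOIN publication AS p ON p.publication_id = a.publication_id
    WHERE a.article_id = 1
    ORDER BY a.is_reprint, p.publication_id
""")]
print(where_published)
```

Keywords and authors stay attached to the single article row, so a reprint does not need to be indexed twice.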

Regards

Martin

Re: Suggested Keyword List

Martin;

My effective "keywords" for the example Benny gives are as follows:

freight station
freight stations
freight-station
freight-stations
freight depot
freight depots
freight-depot
freight-depots
freight house
freight houses
freight-house
freight-houses
freighthouse
freighthouses
freight terminal
freight terminals
freight-terminal
freight-terminals
freight terminus
freight termini
freight-terminus
freight-termini
freight shed
freight sheds
freight-shed
freight-sheds

passenger station
passenger stations
passenger-station
passenger-stations
passenger depot
passenger depots
passenger-depot
passenger-depots
passenger shelter
passenger shelters
passenger-shelter
passenger-shelters
passenger terminal
passenger terminals
passenger-terminal
passenger-terminals
passenger terminus
passenger termini
passenger-terminus
passenger-termini
passenger shed
passenger sheds
passenger-shed
passenger-sheds
depot    (defaults to passenger)
depots    (defaults to passenger)
station   (defaults to passenger)
stations  (defaults to passenger)

This looks like a lot, but it comes from the experience of watching not only what people search for, but how they think. The book, "Don't make me think!", is true to a point. However, they DO think, and not necessarily like we do. So if we want to help them find what they search for, we have to bend our thinking to fit theirs.
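A list like that does not have to be typed by hand. Here is a small Python sketch that generates the same pattern of variants (singular and plural, spaced and hyphenated, plus the closed-up "freighthouse" forms) and maps each one to a single keyword. The function name and the two target keywords are assumptions for the example.

```python
PLURALS = {
    "station": "stations", "depot": "depots", "house": "houses",
    "terminal": "terminals", "terminus": "termini",
    "shed": "sheds", "shelter": "shelters",
}


def variants(prefix, nouns, closed_up=()):
    """Spell out every search variant for prefix + noun."""
    out = []
    for noun in nouns:
        forms = (noun, PLURALS[noun])
        for sep in (" ", "-"):
            out.extend(f"{prefix}{sep}{form}" for form in forms)
        if noun in closed_up:  # e.g. "freighthouse"
            out.extend(f"{prefix}{form}" for form in forms)
    return out


freight_terms = variants(
    "freight", ["station", "depot", "house", "terminal", "terminus", "shed"],
    closed_up={"house"},
)
passenger_terms = variants(
    "passenger", ["station", "depot", "shelter", "terminal", "terminus", "shed"],
)

# Map every variant to one keyword; the target names are hypothetical.
lookup = {term: "freight station" for term in freight_terms}
lookup.update({term: "passenger station" for term in passenger_terms})
for bare in ("depot", "depots", "station", "stations"):
    lookup[bare] = "passenger station"  # bare words default to passenger

print(len(freight_terms), len(passenger_terms), lookup["freight-termini"])
```

The searcher can type any of the variants, while the person entering articles only ever picks from the short list of target keywords.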

Rod


 

Possible Data Model

Martin;

Going back to your initial database design, you have a "series" entity. A series is often the building of a beginner's railroad, and usually the only things common are the title, the name of the railroad and the author. A fairly typical series might be:
part 1 - the track plan;
part 2 - benchwork;
part 3 - roadbed and track;
etc.
If I am looking for benchwork articles, I couldn't care less that it is part of a series. Just show me how to do benchwork.

Also I recently came across a little anomaly. I found a series on resin casting where the title changed between the 2nd and 3rd parts. That would be a problem with the table as shown.

The database is getting more complicated all the time, and I wonder about the value of the "series" table.

Rod


 

Keywords and Synonyms

Rod,

as mentioned in my earlier posts, I expect the keyword to be added by whoever enters the article into the database. Therefore the list should be as short as possible for that person. We should agree on one or two of the most common words for a specific item. My suggestion in this case is freight house and station.

(BTW I'm not a native English speaker, so if other words would be more correct, let me know)

BTW a good part of the list was taken from this site:

www.modelrailwayjournal.com/tags.php

A tag cloud is another cool way to help people find what they're looking for. This also relies on having a short keyword list.

The terms you suggest are synonyms for one possible keyword. The list could be shortened a bit:

- remove all punctuation

- reduce the search expression to the stem of the word (see earlier post on Perl modules)

- search by synonyms as you suggested.
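The three steps above can be sketched in a few lines of Python. The stem function here is a deliberately crude plural-stripper standing in for a real stemmer, and the synonym table is a made-up example, so take this as an outline of the idea only.

```python
import re

# Hypothetical synonym table: search word -> agreed keyword word.
SYNONYMS = {"depot": "station", "terminal": "station", "terminus": "station"}

# Irregular plurals the crude stemmer below would otherwise miss.
IRREGULAR = {"termini": "terminus"}


def stem(word):
    """Very rough stand-in for a stemmer: only strips a plural 's'."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    if word.endswith("s") and not word.endswith(("ss", "us")):
        return word[:-1]
    return word


def normalize(expression):
    """Step 1: drop punctuation. Step 2: stem. Step 3: apply synonyms."""
    words = re.sub(r"[^a-z0-9]+", " ", expression.lower()).split()
    return " ".join(SYNONYMS.get(stem(w), stem(w)) for w in words)


print(normalize("Freight-Depots"))
```

With this in front of the search, "freight-depots", "freight depot" and "Freight Terminals" all end up as the same expression.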

Regards

Martin

Handling series

Rod,

The database is getting more complicated all the time, and I wonder about the value of the "series" table.

My favorite magazine (the Narrow Gauge Gazette) often has series of articles with related content, like "30 Inch Railroads of North America". In that case I often found it useful to get a list of all pieces with one click.

The anomaly you mentioned wouldn't be a big problem. The series has a name (probably something like "Resin Casting") stored in the series table. Every part has its own title stored in the article table. The series title can't change, the individual title and sub-title could.
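To show what I mean, here is a rough Python/sqlite3 sketch. The column names and the article titles are placeholders, not the real model or the real series. The series name lives in one table, each part keeps its own title, and a stand-alone article simply has no series.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE series  (series_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE article (article_id INTEGER PRIMARY KEY,
                      title      TEXT NOT NULL,
                      series_id  INTEGER REFERENCES series(series_id),  -- NULL if stand-alone
                      part_no    INTEGER);
""")

conn.execute("INSERT INTO series VALUES (1, 'Resin Casting')")
# Placeholder titles; the third part deliberately has a different title.
conn.executemany(
    "INSERT INTO article (title, series_id, part_no) VALUES (?, ?, ?)",
    [("Casting in resin", 1, 1),
     ("Casting in resin", 1, 2),
     ("Resin parts for your layout", 1, 3),
     ("Benchwork basics", None, None)],
)

# All pieces of one series with one query, whatever their individual titles.
parts = [row[0] for row in conn.execute(
    "SELECT title FROM article WHERE series_id = 1 ORDER BY part_no"
)]
print(parts)
```

A search for benchwork articles never has to touch the series table, so the extra table costs nothing for people who don't care about series.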

I don't think that the data model is overly complex by now. Personally I think it's almost done from my perspective. Only the piece with the reprints needs to be added.

And then I'll see whether anybody shows up to actually implement it.

Regards

Martin

