Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 1061

2021-03-06 Big Data Applications

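Since June 15, 2017, Data Application Lab has shared and discussed one common interview question from data science (DS) and business analytics (BA) with you every day.

Since October 4, 2017, we have also shared one Leetcode algorithm problem every day.

If you are actively looking for work in these fields, we hope you will follow our questions each day and think them through with us. We publish the answers the following day.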

What is cross-validation? How to do it right?

Combine two tables

Table: Person

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| PersonId    | int     |
| FirstName   | varchar |
| LastName    | varchar |
+-------------+---------+

PersonId is the primary key column for this table.


Table: Address

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| AddressId   | int     |
| PersonId    | int     |
| City        | varchar |
| State       | varchar |
+-------------+---------+

AddressId is the primary key column for this table.

Write a SQL query for a report that provides the following information for each person in the Person table, regardless of whether an address exists for that person:

FirstName, LastName, City, State
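
One possible approach is a LEFT JOIN from Person to Address, which keeps every person and leaves City and State as NULL when no address row matches. The sketch below runs that query through Python's built-in sqlite3 module. The sample rows, the function name `run_report`, and the `ORDER BY` clause (added only to make the demo output deterministic) are assumptions for illustration, not part of the original problem.

```python
import sqlite3

# LEFT JOIN keeps every Person row; City/State are NULL when no address exists.
QUERY = """
SELECT p.FirstName, p.LastName, a.City, a.State
FROM Person AS p
LEFT JOIN Address AS a ON a.PersonId = p.PersonId
ORDER BY p.PersonId
"""


def run_report():
    """Build a tiny in-memory database and return the report rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
        CREATE TABLE Address (AddressId INTEGER PRIMARY KEY, PersonId INTEGER, City TEXT, State TEXT);
        -- Hypothetical sample rows, not from the original problem.
        INSERT INTO Person VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
        INSERT INTO Address VALUES (10, 1, 'Springfield', 'IL');
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in run_report():
    print(row)
```

The second person has no address, so sqlite3 returns `None` for both City and State in that row; an INNER JOIN would have dropped the person entirely.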

Two Sum

Description:
Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input has exactly one solution, and you may not use the same element twice.

Input: [2, 7, 11, 15], target = 9

Output: [0, 1]

Assumptions:

1. Each input has exactly one solution.

2. You may not use the same element twice.

3. The input array is sorted in ascending order.
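
Because the array is sorted, one way to approach the problem is with two pointers that move inward from both ends, which needs O(n) time and O(1) extra space. The sketch below assumes a target of 9 for the sample input (the value implied by the output [0, 1], since 2 + 7 = 9); the function name `two_sum_sorted` is illustrative.

```python
def two_sum_sorted(nums, target):
    """Return indices [i, j] with i < j and nums[i] + nums[j] == target.

    Relies on nums being sorted in ascending order.
    """
    left, right = 0, len(nums) - 1
    while left < right:
        total = nums[left] + nums[right]
        if total == target:
            return [left, right]
        if total < target:
            left += 1    # sum too small: move to a larger left value
        else:
            right -= 1   # sum too large: move to a smaller right value
    return []  # not reached when exactly one solution exists


print(two_sum_sorted([2, 7, 11, 15], 9))  # [0, 1]
```

If the sorted assumption were dropped, a hash map from value to index would be the usual alternative, at the cost of O(n) extra space.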

DS Interview Question & Answer

During analysis, how do you treat missing values?

Whether to treat missing values at all is the first point to consider. If 80% of the values for a variable are missing, you may drop the variable instead of treating the missing values.

Deleting the observations: appropriate when you have sufficient data points and the deletion will not introduce bias

Imputation with mean / median / mode, or with a default value

Imputation with models such as KNN or MICE

Use other features to build a model to predict the missing part

...
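
The simpler options above (checking how much is missing, then imputing with a statistic or a default) can be sketched with the Python standard library alone. This is a minimal illustration rather than a recommended pipeline: the helper names `missing_ratio` and `impute`, the use of `None` to mark a missing value, and the sample list are all assumptions made for the example.

```python
from statistics import mean, median, mode


def missing_ratio(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def impute(values, strategy="mean", default=0):
    """Replace None entries with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:  # any other strategy name falls back to the default value
        fill = default
    return [fill if v is None else v for v in values]


data = [10, None, 20, 60, None]  # hypothetical column with two missing values
print(missing_ratio(data))       # 0.4
print(impute(data, "mean"))      # [10, 30, 20, 60, 30]
print(impute(data, "median"))    # [10, 20, 20, 60, 20]
```

The mean and median fills differ here because the observed value 60 pulls the mean upward, which is one reason the median is often preferred for skewed variables.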

Reference:

https://www.r-bloggers.com/missing-value-treatment/

BA Interview Question & Answer

Write a SQL query to obtain the names of all patients whose primary care physician is not the head of any department, along with the name of that physician.

Table: patient (pt)

    ssn    |       name        |      address       |  phone   | insuranceid | pcp
-----------+-------------------+--------------------+----------+-------------+-----
 100000001 | John Smith        | 42 Foobar Lane     | 555-0256 |    68476213 |   1
 100000002 | Grace Ritchie     | 37 Snafu Drive     | 555-0512 |    36546321 |   2
 100000003 | Random J. Patient | 101 Omgbbq Street  | 555-1204 |    65465421 |   2
 100000004 | Dennis Doe        | 1100 Foobaz Avenue | 555-2048 |    68421879 |   3

Table: physician (p)

 employeeid |       name        |           position           |    ssn
------------+-------------------+------------------------------+-----------
          1 | John Dorian       | Staff Internist              | 111111111
          2 | Elliot Reid       | Attending Physician          | 222222222
          3 | Christopher Turk  | Surgical Attending Physician | 333333333
          4 | Percival Cox      | Senior Attending Physician   | 444444444
          5 | Bob Kelso         | Head Chief of Medicine       | 555555555
          6 | Todd Quinlan      | Surgical Attending Physician | 666666666
          7 | John Wen          | Surgical Attending Physician | 777777777
          8 | Keith Dudemeister | MD Resident                  | 888888888
          9 | Molly Clock       | Attending Psychiatrist       | 999999999

Answer: 

SELECT pt.name AS "Patient",
       p.name AS "Primary care Physician"
FROM patient pt
JOIN physician p ON pt.pcp = p.employeeid
WHERE pt.pcp NOT IN
    (SELECT head
     FROM department);

https://www.w3resource.com/sql-exercises/hospital-database-exercise/sql-exercise-hospital-database-39.php
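
To see the answer query in action, the sketch below runs it against an in-memory SQLite database using Python's sqlite3 module. The patient and physician rows are taken from the tables above (trimmed to the columns the query uses). The department table is not shown in this post, so its columns other than `head` and its two rows are purely hypothetical, chosen so that one patient is filtered out. The function name and the `ORDER BY` clause are also additions for the demo.

```python
import sqlite3

QUERY = """
SELECT pt.name AS "Patient",
       p.name AS "Primary care Physician"
FROM patient pt
JOIN physician p ON pt.pcp = p.employeeid
WHERE pt.pcp NOT IN (SELECT head FROM department)
ORDER BY pt.ssn
"""


def patients_with_non_head_pcp():
    """Return (patient, physician) pairs whose physician heads no department."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE physician (employeeid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE patient (ssn INTEGER PRIMARY KEY, name TEXT, pcp INTEGER);
        CREATE TABLE department (departmentid INTEGER PRIMARY KEY, name TEXT, head INTEGER);

        INSERT INTO physician VALUES
            (1, 'John Dorian'), (2, 'Elliot Reid'),
            (3, 'Christopher Turk'), (4, 'Percival Cox');
        INSERT INTO patient VALUES
            (100000001, 'John Smith', 1), (100000002, 'Grace Ritchie', 2),
            (100000003, 'Random J. Patient', 2), (100000004, 'Dennis Doe', 3);

        -- Hypothetical department rows (assumption, not from the post):
        -- physicians 3 and 4 are treated as department heads here.
        INSERT INTO department VALUES (1, 'Dept A', 4), (2, 'Dept B', 3);
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in patients_with_non_head_pcp():
    print(row)
```

With these assumed heads, Dennis Doe is excluded because his physician (employeeid 3) heads a department. One caveat worth knowing in interviews: if `head` can be NULL, `NOT IN` returns no rows at all, and `NOT EXISTS` is the safer form.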

LeetCode Question & Answer

Pascal’s Triangle II

Description:

Given an index k, return the kth row of Pascal’s triangle (rows are 0-indexed).

Input: 3

Output: [1,3,3,1]

Follow-up:

Could you optimize your algorithm to use only O(k) extra space?

Solution:

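This is a follow-up to Pascal’s Triangle; the key point is the O(k) space complexity.

A rolling array (reusing one array for every row) achieves O(k) space complexity.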

Code:
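
A minimal sketch of the rolling-array idea, assuming k is 0-indexed (so k = 3 gives [1,3,3,1], matching the example). The function name `get_row` is illustrative. A single list is updated in place, row by row, from right to left so that each entry still sees the previous row's values.

```python
def get_row(k):
    """Return row k (0-indexed) of Pascal's triangle using O(k) extra space."""
    row = [1] * (k + 1)
    for i in range(2, k + 1):
        # Update right-to-left so row[j - 1] still holds the previous row's value.
        for j in range(i - 1, 0, -1):
            row[j] += row[j - 1]
    return row


print(get_row(3))  # [1, 3, 3, 1]
```

Updating left-to-right instead would overwrite `row[j - 1]` before it is used, which is why the inner loop runs backwards.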

Time Complexity: O(k^2)

Space Complexity: O(k)
